
download_geo_sirene() is a wrapper around download_archive() that makes it easy to download the geolocalized SIRENE database.

Usage

download_geo_sirene(
  destination,
  repository = c("cquest", "data.gouv", "other"),
  origin = NULL,
  name = NULL,
  extension = NULL,
  version = NULL,
  month = NULL,
  version_insee = c("2019", "2017"),
  scope = c("france", "department", "commune"),
  verbose = TRUE
)

Arguments

destination

character, the path where data are stored.

repository

character, keyword for the repository from which data are downloaded. Prefer one of the given options. See details.

origin

character, URL from which data are downloaded. It is set automatically when repository is anything other than "other".

name

character, vector of acceptable names of archives to be downloaded. See details.

extension

character, vector of acceptable types of archives to be downloaded.

version

string, version of geo_sirene to be downloaded. Useful to select a department or a commune when repository is "data.gouv".

month

character, vector of months to be downloaded, for instance "2019-01".

version_insee

string, INSEE version to be downloaded. Only used with repository = "cquest". See details.

scope

string, scope for download when repository is "data.gouv". See details.

verbose

logical, should the function print messages while running?

Value

A data.frame of download logs, returned invisibly.

Details

If repository is set to "cquest", then all necessary variables are filled with these values:

  • origin is set to https://data.cquest.org/geo_sirene. This URL is then adapted to version_insee and to the vintage of the data.

  • version_insee can be set either to "2019" (default) or "2017".

  • month is used to set something equivalent to a date, and to adapt origin. The adaptation depends on the year because of the structure of the repository (some months are grouped by year, some are not, and this has changed over time).

  • name default value is changed to "StockEtablissement_utf8" when version_insee is "2019", and to "etablissements_actifs" when version_insee is "2017", except for old vintages, for which it is changed to "geo_sirene".

  • extension default value is changed to "csv.gz".
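The defaults listed above can be restated as a small sketch. The helper cquest_defaults() and its old_vintage flag are hypothetical and not part of the package; the flag stands in for the vintage check, whose exact cutoff is not given here. The use of match.arg() to resolve the vector default is also an assumption about how such arguments are usually handled in R.

```r
# Hypothetical helper (not part of the package): restates the documented
# defaults for repository = "cquest".
cquest_defaults <- function(version_insee = c("2019", "2017"),
                            old_vintage = FALSE) {
  # match.arg() returns the first choice when the argument is left untouched
  version_insee <- match.arg(version_insee)

  name <- if (version_insee == "2019") {
    "StockEtablissement_utf8"
  } else if (old_vintage) {
    # Which months count as an old vintage is not specified, hence the flag
    "geo_sirene"
  } else {
    "etablissements_actifs"
  }

  list(
    origin = "https://data.cquest.org/geo_sirene", # base URL, before adaptation
    version_insee = version_insee,
    name = name,
    extension = "csv.gz"
  )
}

str(cquest_defaults())
str(cquest_defaults("2017"))
```

The real function additionally adapts origin to version_insee and month, which this sketch leaves out.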

If repository is set to "data.gouv", then all necessary variables are filled with these values:

  • origin is changed to https://files.data.gouv.fr/geo-sirene.

  • version_insee is set to "2019".

  • month is used to set something equivalent to a date, and to adapt origin.

  • scope selects what is downloaded (France, departments or communes). It is used to adapt origin.

  • name is set depending on scope: "StockEtablissementActif_utf8_geo" for "france", "geo_siret" for "department", NULL for "commune".

  • extension is set depending on scope: "csv.gz" for "france" and "department", "csv" for "commune".
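The mapping from scope to name and extension can be sketched as follows. The helper data_gouv_defaults() is hypothetical and not part of the package; it only restates the values listed above and does not reproduce how origin is adapted to month or scope.

```r
# Hypothetical helper (not part of the package): restates the documented
# defaults for repository = "data.gouv".
data_gouv_defaults <- function(scope = c("france", "department", "commune")) {
  # match.arg() returns "france" when scope is left at its default
  scope <- match.arg(scope)

  name <- switch(scope,
    france = "StockEtablissementActif_utf8_geo",
    department = "geo_siret",
    commune = NULL
  )
  extension <- if (scope == "commune") "csv" else "csv.gz"

  # list() keeps name as an explicit NULL element for the "commune" scope
  list(
    origin = "https://files.data.gouv.fr/geo-sirene", # base URL, before adaptation
    version_insee = "2019",
    scope = scope,
    name = name,
    extension = extension
  )
}

str(data_gouv_defaults("department"))
```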

If repository is set to "other", all remaining arguments must be filled in so that download_archive() can make a successful download.

Examples

if (FALSE) {
# Download into a temporary directory
dest = tempdir()
month = c("2019-01", "2020-01", "2021-01")

# Default repository, with and without explicit months
download_geo_sirene(dest)
download_geo_sirene(dest, month = month)
# Same repository, 2017 INSEE version
download_geo_sirene(dest, version_insee = "2017")
download_geo_sirene(dest, version_insee = "2017", month = "2018-01")

# "data.gouv" repository; the returned log can be stored and inspected
download_geo_sirene(dest, repository = "data.gouv")
month = c("2020-01", "2021-01", "2022-01", "2023-01", "2024-01")
log_sirene = download_geo_sirene(dest, "data.gouv", month = month)
log_sirene
# Restrict the download to one department or one commune
download_geo_sirene(dest, "data.gouv", scope = "department", version = "34")
download_geo_sirene(dest, "data.gouv", scope = "commune", version = "34170")

# Clean up
unlink(dest)
}