Package 'statnipokladna' reference manual

Title:	Use Data from the Czech Public Finance Database
Description:	Get programmatic access to data from the Czech public budgeting and accounting database, Státní pokladna <https://monitor.statnipokladna.cz/>.
Authors:	Petr Bouchal [aut, cre]
Maintainer:	Petr Bouchal <[email protected]>
License:	MIT + file LICENSE
Version:	0.7.4
Built:	2025-01-30 04:53:20 UTC
Source:	https://github.com/petrbouchal/statnipokladna

Deprecated: Add codelist data to downloaded data

Description

Deprecated, use sp_add_codelist() instead.

Usage

add_codelist(
  data,
  codelist = NULL,
  period_column = .data$vykaz_date,
  redownload = FALSE,
  dest_dir = NULL
)

Arguments

`data`	a data frame returned by `sp_get_table()`.
`codelist`	The codelist to add. Either a character vector of length one (see `sp_codelists` for possible values), or a data frame returned by `sp_get_codelist()`.
`period_column`	Unquoted column name of column identifying the data period in `data`. Leave to default if you have not changed the `data` object returned by `sp_get_table()`.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.

Value

A tibble with the same number of rows as data, with columns from the codelist added. See the Details of sp_add_codelist().

Deprecated: Get codelist

Description

Deprecated: use sp_get_codelist()

Usage

get_codelist(codelist_id, n = NULL, dest_dir = NULL, redownload = FALSE)

Arguments

`codelist_id`	A codelist ID. See `id` column in `sp_codelists` for a list of available codelists.
`n`	Number of rows to return. Default (NULL) means all. Useful for quickly inspecting a codelist.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.

Value

A tibble

Add codelist data to downloaded data

Description

Joins a provided codelist, or downloads and processes one if necessary, and adds it to the data.

Usage

sp_add_codelist(
  data,
  codelist = NULL,
  period_column = .data$vykaz_date,
  by = NULL,
  redownload = FALSE,
  dest_dir = NULL
)

Arguments

`data`	a data frame returned by `sp_get_table()`.
`codelist`	The codelist to add. Either a character vector of length one (see `sp_codelists` for possible values), or a data frame returned by `sp_get_codelist()`.
`period_column`	Unquoted column name of column identifying the data period in `data`. Leave to default if you have not changed the `data` object returned by `sp_get_table()`.
`by`	character. Columns by which to join the codelist. Same form as for `dplyr::left_join()`.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.

Details

The data argument should be a data frame produced by sp_get_table(). If it is, the period_column argument is not needed. The codelist argument, if a data frame, should be a data frame produced by sp_get_codelist(). Specifically, the function assumes it contains the following columns:

start_date, a date
end_date, a date
a column with the code, character, usually named the same as the codelist

You can usually tell which codelist you need from the name of the column whose code you are looking to expand, e.g. the codes in column paragraf can be expanded by codelist paragraf.

The function filters the codelist to obtain a set of entries relevant to the time period of data. If data contains tables for multiple periods, this is handled appropriately. Codelist-originating columns in the resulting data frame are renamed so they do not interfere with joining additional codelists, perhaps in a single pipe call.

Note that some codelists are "secondary" and can only be joined onto other codelists. If a codelist does not join using sp_add_codelist(), store the output of sp_get_codelist() and join it manually using dplyr.
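
The period filtering described above can be pictured with a small base R sketch. It is not the package's implementation: the toy data, the label column nazev, the far-future end date and the inclusive date comparison are all assumptions made for illustration.

```r
# Toy data standing in for sp_get_table() output (values are made up).
budget <- data.frame(
  vykaz_date = as.Date(c("2017-12-31", "2019-12-31")),
  paragraf   = c("3111", "3111"),
  amount     = c(100, 120),
  stringsAsFactors = FALSE
)

# Toy codelist with start_date, end_date and a code column;
# the label column `nazev` is a hypothetical name. The label changes in 2018.
paragraf_cl <- data.frame(
  paragraf   = c("3111", "3111"),
  nazev      = c("Old label", "New label"),
  start_date = as.Date(c("2010-01-01", "2018-01-01")),
  end_date   = as.Date(c("2017-12-31", "9999-12-31")),
  stringsAsFactors = FALSE
)

# Join on the code, then keep only codelist entries valid on the data period
# (inclusive bounds are an assumption).
joined <- merge(budget, paragraf_cl, by = "paragraf")
joined <- joined[joined$vykaz_date >= joined$start_date &
                   joined$vykaz_date <= joined$end_date, ]

# Rename the codelist-originating column so a second codelist would not clash.
names(joined)[names(joined) == "nazev"] <- "paragraf_nazev"
joined <- joined[order(joined$vykaz_date),
                 c("vykaz_date", "paragraf", "amount", "paragraf_nazev")]
print(joined)
```

Each data row ends up with exactly one codelist entry, so the result has as many rows as the input, which is the property the Value section below relies on.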

Value

A tibble with the same number of rows as data, with columns from the codelist added. See Details.

Examples

## Not run: 
library(dplyr) # provides the %>% pipe used below

sp_get_table("budget-central", 2017) %>%
  sp_add_codelist("polozka") %>%
  sp_add_codelist("paragraf")

# download the codelists once and reuse them
pol <- sp_get_codelist("polozka")
par <- sp_get_codelist("paragraf")

sp_get_table("budget-central", 2017) %>%
  sp_add_codelist(pol) %>%
  sp_add_codelist(par)

## End(Not run)

List of available codelists

Description

Contains IDs and names of most available codelists that can be retrieved by sp_get_codelist.

Usage

sp_codelists

Format

A data frame with 27 rows and 2 variables:

id: character. ID, used as codelist_id argument in sp_get_codelist.
name: character. Short name, mostly corresponds to title used on statnipokladna.cz.

Details

The id is to be used as the codelist_id parameter in sp_get_codelist. See https://monitor.statnipokladna.cz/datovy-katalog/ciselniky for more detailed descriptions and a GUI for exploring the lists.

List of available datasets

Description

Contains IDs and names of all available datasets that can be retrieved by sp_get_dataset.

Usage

sp_datasets

Format

A data frame with 9 rows and 3 variables:

id: character. Dataset ID, used as dataset_id argument to sp_get_dataset.
name: character. Dataset name, mostly corresponds to title on the statnipokladna GUI.

Details

See https://monitor.statnipokladna.cz/datovy-katalog/transakcni-data for more detailed descriptions of the datasets.

Get codelist

Description

Downloads and processes the codelist identified by codelist_id. See sp_codelists for a list of available codelists with their IDs and names.

Usage

sp_get_codelist(codelist_id, n = NULL, dest_dir = NULL, redownload = FALSE)

Arguments

`codelist_id`	A codelist ID. See `id` column in `sp_codelists` for a list of available codelists.
`n`	Number of rows to return. Default (NULL) means all. Useful for quickly inspecting a codelist.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.

Details

You can usually tell which codelist you need from the name of the column whose code you are looking to expand, e.g. the codes in column paragraf can be expanded by codelist paragraf.

The processing ensures that the resulting codelist can be correctly joined to the data, automatically using sp_add_codelist() or manually. The entire codelist is downloaded and not filtered for any particular date.

Codelist XML files are stored in a temporary directory as determined by tempdir() and persist per session to avoid redownloads.

Value

a tibble

Examples

## Not run: 
sp_get_codelist("paragraf")

## End(Not run)

Download a codelist XML file

Description

This is normally called inside sp_get_codelist() but can be used separately if finer-grained control of intermediate outputs is needed, e.g. in a {targets} workflow.

Usage

sp_get_codelist_file(
  codelist_id = NULL,
  url = NULL,
  dest_dir = NULL,
  redownload = FALSE
)

Arguments

`codelist_id`	A codelist ID. See `id` column in `sp_codelists` for a list of available codelists.
`url`	URL of the codelist XML file, as returned by `sp_get_codelist_url()`. Either this or `codelist_id` must be set. If both are set, `url` wins.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.

Value

path to XML file; character vector of length one.

Examples

## Not run: 
sp_get_codelist_file("druhuj")
codelist_url <- sp_get_codelist_url("druhuj")
sp_get_codelist_file(url = codelist_url)

## End(Not run)

Get URL of a given codelist

Description

This is normally called inside sp_get_codelist() but can be used separately if finer-grained control of intermediate outputs is needed, e.g. in a {targets} workflow.

Usage

sp_get_codelist_url(codelist_id, check_if_exists = TRUE)

Arguments

`codelist_id`	A codelist ID. See `id` column in `sp_codelists` for a list of available codelists.
`check_if_exists`	Whether to check that the URL works (HTTP 200).

Value

character vector of length one containing URL

Examples

## Not run: 
sp_get_codelist_url("ucjed", FALSE)
if(FALSE) sp_get_codelist_url("ucjed_wrong", TRUE) # fails, invalid codelist

## End(Not run)

Retrieve dataset from statnipokladna

Description

Downloads ZIP archives for a given dataset. If year or month have length > 1, gets all combinations.
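
As a rough illustration of what "all combinations" means, the base R sketch below enumerates year and month pairs with expand.grid(). Whether the package builds its download list this way internally is an assumption; the point is only that two years and two months correspond to four archives.

```r
# Hypothetical inputs, as they might be passed to the year and month arguments.
year  <- c(2017, 2018)
month <- c(6, 12)

# Every year is paired with every month.
periods <- expand.grid(year = year, month = month)
print(periods)

# Number of archives a call with these inputs would refer to.
n_archives <- nrow(periods)
```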

Usage

sp_get_dataset(
  dataset_id,
  year,
  month = 12,
  dest_dir = NULL,
  redownload = FALSE
)

Arguments

`dataset_id`	A dataset ID. See `id` column in `sp_datasets` for a list of available datasets.
`year`	year, numeric vector of length >= 1 (can take multiple values), 2015-2019 for some datasets, 2010-2020 for others. (See Details for how to work with data across time periods.)
`month`	month, numeric vector of length >= 1 (can take multiple values). Must be between 1 and 12. Defaults to 12. (See Details for how to work with data across time periods.)
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.
`redownload`	Redownload even if file has already been downloaded? Defaults to FALSE.

Details

Files are stored in a temporary folder as determined by tempdir(), or in the directory given by the dest_dir parameter or the statnipokladna.dest_dir option, and are further sorted into subdirectories by dataset, year and month. If saved to tempdir() (the default), downloaded files persist per session to avoid redownloads.
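
The fallback order for the download directory can be sketched as follows. resolve_dest_dir() is a hypothetical helper written for this illustration, not a function exported by the package; only the statnipokladna.dest_dir option name and the tempdir() fallback come from the argument description above.

```r
# Hypothetical helper mirroring the documented order:
# explicit argument, then the option, then tempdir().
resolve_dest_dir <- function(dest_dir = NULL) {
  if (is.null(dest_dir)) {
    dest_dir <- getOption("statnipokladna.dest_dir", default = tempdir())
  }
  # "Will be created if it does not exist."
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest_dir
}

# Start with the option unset, remembering any previous value.
old <- options(statnipokladna.dest_dir = NULL)
from_default <- resolve_dest_dir()

# With the option set, it is used when no argument is given.
options(statnipokladna.dest_dir = file.path(tempdir(), "sp-data"))
from_option <- resolve_dest_dir()

# An explicit argument takes precedence over the option.
from_arg <- resolve_dest_dir(file.path(tempdir(), "explicit"))

# Restore the previous option value.
options(old)
```

Setting the option once, for example in a project's .Rprofile, keeps downloads in one place across sessions instead of in the per-session tempdir().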

How data for different time periods is exported differs by dataset. This has significant implications for how you get to usable full-year numbers or time series in different tables. See vignette("statnipokladna") for details on this.

Value

character vector with complete paths to downloaded ZIP archives.

Examples

## Not run: 
budget_2018 <- sp_get_dataset("finm", 2018)
budget_mid2018 <- sp_get_dataset("finm", 2018, 6)

## End(Not run)

Get dataset documentation

Description

Downloads XLS file with dataset documentation, or opens link to this file in browser.

Usage

sp_get_dataset_doc(dataset_id, dest_dir = NULL, download = TRUE)

Arguments

`dataset_id`	dataset ID. See `sp_datasets`.
`dest_dir`	character. Directory in which downloaded files will be stored. If left unset, will use the `statnipokladna.dest_dir` option if the option is set, and `tempdir()` otherwise. Will be created if it does not exist.
`download`	Whether to download (the default) or open link in browser.

Value

(invisible) path to file if download = TRUE, URL otherwise

Examples

## Not run: 
sp_get_dataset_doc("finm")

## End(Not run)

Get URL of dataset

Description

Useful for workflows where you want to keep track of URLs and intermediate files, rather than having all steps performed by one function.

Usage

sp_get_dataset_url(dataset_id, year, month = 12, check_if_exists = TRUE)

Arguments

`dataset_id`	Dataset ID. See `id` column in `sp_datasets` for a list of available datasets.
`year`	year, numeric vector of length >= 1 (can take multiple values), 2015-2019 for some datasets, 2010-2020 for others. (See the Details of `sp_get_dataset()` for how to work with data across time periods.)
`month`	month, numeric vector of length >= 1 (can take multiple values). Must be between 1 and 12. Defaults to 12. (See the Details of `sp_get_dataset()` for how to work with data across time periods.)
`check_if_exists`	Whether to check that the URL works (HTTP 200).

Value

a character vector of length one, containing a URL

Examples

## Not run: 
sp_get_dataset_url("finm", 2018, 6, FALSE)
sp_get_dataset_url("finm", 2029, 6, FALSE) # works but returns invalid URL
if(FALSE) sp_get_dataset_url("finm_wrong", 2018, 6, TRUE) # fails, invalid dataset ID
if(FALSE) sp_get_dataset_url("finm", 2022, 6, TRUE) # fails, invalid time period

## End(Not run)

Get path to a CSV file containing a table.

Description

This is normally called inside sp_get_table() but can be used separately if finer-grained control of intermediate outputs is needed, e.g. in a {targets} workflow.

Usage

sp_get_table_file(table_id, dataset_path, reunzip = FALSE)

Arguments

`table_id`	Table ID; see `id` column in `sp_tables` for a list of available tables.
`dataset_path`	Path to downloaded dataset, as output by `sp_get_dataset()`.
`reunzip`	Whether to overwrite existing CSV files by unzipping the archive downloaded by `sp_get_dataset()`. Defaults to FALSE.

Value

Character vector of length one - a path.

Examples

## Not run: 
ds <- sp_get_dataset("rozv", 2018, 12)
sp_get_table_file("balance-sheet", ds)

## End(Not run)

Load codelist into a tibble from XML file

Description

This is normally called inside sp_get_codelist() but can be used separately if finer-grained control of intermediate outputs is needed, e.g. in a {targets} workflow.

Usage

sp_load_codelist(path, n = NULL)

Arguments

`path`	Path to a file as returned by `sp_get_codelist_file()`
`n`	Number of rows to return. Default (NULL) means all. Useful for quickly inspecting a codelist.

Value

a tibble

Examples

## Not run: 
cf <- sp_get_codelist_file("druhuj")
sp_load_codelist(cf)

## End(Not run)

Load a statnipokladna table from a CSV file

Description

This is normally called inside sp_get_table() but can be used separately if finer-grained control of intermediate outputs is needed, e.g. in a {targets} workflow.

Usage

sp_load_table(path, ico = NULL)

Arguments

`path`	path to a CSV file, as output by `sp_get_table_file()`.
`ico`	Organisation ID to filter by, if supplied.

Value

a tibble. See help for sp_get_table() for a key to the columns.

Examples

## Not run: 
ds <- sp_get_dataset("rozv", 2018, 12)
tf <- sp_get_table_file("balance-sheet", ds)
sp_load_table(tf)

## End(Not run)

List of available tables (PARTIAL)

Description

Contains IDs and names of all available tables that can be retrieved by sp_get_table. Look inside the XLS documentation for each dataset at https://monitor.statnipokladna.cz/datovy-katalog/transakcni-data to see more detailed descriptions. Note that tables do not correspond to the tabulka/vtab attribute of the tables, they represent files inside datasets.

Usage

sp_tables

Format

A data frame with 2 rows and 4 variables:

id: character. Table id, used as table_id argument to sp_get_table.
dataset_id: integer. Table number.
czech_name: character. Czech name of the table.
note: character. Note.
