Publication detail

Utilization of databases with missing data for classification of the EU regions

ODEHNAL, J.; NEUBAUER, J.; MICHÁLEK, J.

Czech title

Použití neúplných datových souborů ke klasifikaci regionů EU

English title

Utilization of databases with missing data for classification of the EU regions

Type

Peer-reviewed article not indexed in WoS or Scopus

Language

cs

Original abstract

Použití neúplných datových souborů ke klasifikaci regionů EU. Empirická analýza.

English abstract

The paper deals with the clustering of 202 European NUTS 2 regions into groups with similar values of 22 economic variables. The data were obtained from the Eurostat Regional Yearbook 2007 and from the Regional Statistics database, and they contain a high number of missing values. The data analysis therefore focuses primarily on filling in the missing values. Three methods were used and compared: filling by the average, by the median, and by the ZET algorithm described in [22]. The clustering results are presented in tables and in a dendrogram. The classification results are then compared with regard to the method used for handling missing data. The conclusion is that the ZET algorithm is a suitable statistical technique for filling in missing data in the considered data files.
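The two simpler filling methods named in the abstract, filling by the average and by the median, can be sketched as follows. This is a minimal illustration only: the function name impute_columns and the toy values are assumptions made for this example, not the authors' code or data, and the ZET algorithm is not implemented here.

```python
from statistics import mean, median


def impute_columns(rows, fill):
    """Return a copy of rows with each None replaced by fill(observed column values).

    `fill` is any function of a list of numbers, e.g. statistics.mean or
    statistics.median. A column with no observed values raises StatisticsError.
    """
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        fills.append(fill(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(r)] for r in rows]


# Toy table (invented for illustration): 4 regions x 2 variables, None = missing.
regions = [
    [10.0, 1.0],
    [12.0, None],
    [None, 3.0],
    [50.0, 2.0],
]

by_mean = impute_columns(regions, mean)
by_median = impute_columns(regions, median)

# The first variable has an outlying value (50.0), so the two fills differ there.
print(by_mean[2][0], by_median[2][0])
```

With a skewed variable such as the first column, the mean fill is pulled toward the outlier while the median fill is not, which is one reason the choice of filling method can change the resulting clusters.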

Keywords in Czech

chybějící data, ZET algoritmus, konkurenceschopnost, klasifikace regionů EU

Keywords in English

missing data, ZET algorithm, competitiveness, NUTS classification of EU regions

Released

2009-05-04

Publisher

Český statistický úřad

Location

Praha

ISSN

0322-788X

Journal

Statistika-Statistics and Economy Journal

Volume

2009

Number

5

Pages from–to

446–461

Pages count

16

BIBTEX


@article{BUT48217,
  author="J. {Odehnal} and J. {Neubauer} and Jaroslav {Michálek}",
  title="Použití neúplných datových souborů ke klasifikaci regionů EU",
  journal="Statistika-Statistics and Economy Journal",
  year="2009",
  volume="2009",
  number="5",
  pages="446--461",
  issn="0322-788X"
}