Wiki source code of SMEAR database

Last modified by Pasi Kolari on 2024/02/09 11:55

Show last authors
1 === **Introduction** ===
2
3 Processed data from continuous measurements is stored in MySQL database hosted by CSC. The data are located in database 'smear' which contains several tables from each station.
4
5 [[SmartSMEAR>>url:https://smear.avaa.csc.fi/||rel=" noopener noreferrer" shape="rect" target="_blank"]] web user interface provides easy access and visualization of the data.
6
7 SmartSMEAR also provides [[API>>url:https://smear-backend.rahtiapp.fi/q/openapi-ui/||rel=" noopener noreferrer" shape="rect" target="_blank"]] for scripted queries. This is the recommended way to retrieve data if you want to set up automated visualization or processing of data or just download the data "on the fly" without saving intermediate files. The UI might not show all database tables, you must use API or retrieve data directly from the database. See instructions and metadata descriptions further below, on the API documentation page and check some [[sample scripts>>doc:SMEAR.The SMEAR Wikispace.SMEAR database.Using SMEAR API.WebHome]] to get started.
8
9 General questions on data availability, terms of use etc. contact [[atm-data@helsinki.fi.>>mailto:atm-data@helsinki.fi||shape="rect"]] It's basically a mailing list with couple of "data-aware" people.
10 More detailed [[list of contact persons>>doc:SMEAR.The SMEAR Wikispace.SMEAR database.Data contact persons.WebHome]]
11
12 Data from FMI weather stations in Värriö, Hyytiälä and Kumpula can be downloaded from [[FMI download service>>url:https://en.ilmatieteenlaitos.fi/download-observations||rel=" noopener noreferrer" shape="rect" target="_blank"]] or using [[FMI open data API>>doc:SMEAR.The SMEAR Wikispace.SMEAR database.FMI data.WebHome]].
13
14
15 === **[[Software configuration for direct database access and SQL queries>>doc:SMEAR.The SMEAR Wikispace.SMEAR database.Setting up database connection.WebHome]]** ===
16
17 === **[[doc:SMEAR.The SMEAR Wikispace.SMEAR database.Data flow.WebHome]]** ===
18
19
20 === **Database structure and metadata** ===
21
22 All data are located in database 'smear' which contains several tables from each stations, e.g. HYY_META for Hyytiälä SMEAR II meteorological and gas data, SII1_EDDY for Siikaneva1 eddy fluxes.
23
24 An easy way to find out the names and locations of the variables in the database (that is, table and column names) is using AVAA:
25
26 1. Tooltips on the variable lists show COLUMN:[column name]:[table name] for each variable.
27 1. Output file headers identify the variables as [table name].[column name]
28
29 Table 'VariableMetadata' contains the column names ('variable'), locations ('tableID'), descriptions, units, source instruments etc. for (almost) all variables in the database. 'TableMetadata' is the key to table ID's in VariableMetadata, it also contains basic station information of the corresponding tables.
30
31 [[Download variable metadata in .csv format>>url:http://www.atm.helsinki.fi/pics/databaseVariableMetadata.csv||shape="rect"]]
32
33 [[Download table metadata in .csv format>>url:http://www.atm.helsinki.fi/pics/databaseTableMetadata.csv||shape="rect"]]
34
35 Note that the attached files are UTF-8 encoded, MS Excel will not show the contents correct. Open for instance with Notepad++ using some Unicode font.
36
37 You can also access the metadata via the [[API>>url:https://smear-backend.rahtiapp.fi/swagger-ui||shape="rect"]].
38
39 If you have database connection open, you can take a look at metadata contents using MySQL command
40
41 {{{SELECT * FROM smear.VariableMetadata;}}}
42
43 Example of more focused MySQL queries for table and column names:
44
45 {{{SELECT tableID, variable FROM smear.VariableMetadata WHERE title LIKE `Air temperatur%`;}}}
46
47 Field 'variable' is the column name. Table names must be parsed from 'tableID' values, for example tableID 2:
48
49 {{{SELECT name FROM smear.TableMetadata WHERE tableID=2;}}}
50
51 Table 'Events' contains descriptions of changes in measurement and data processing. Events are linked to the actual data variables in 'variableEvents' table.
52
53 'Tags' are standard (e.g. NetCDF CF) variable names and linked with actual variables with 'variableTags' table.
54
55 [[More detailed description of metadata tables>>doc:SMEAR.The SMEAR Wikispace.SMEAR database.Database schema.WebHome]].
56
57
58 === **Data processing levels** ===
59
60 Most variables in the database are calculated and inserted to the database near real time with only rudimentary automated quality check. Later they are updated with data processed and checked by the responsible researchers. The quality or processing level is indicated as [variable name]_EMEP column. Level 1 refers to online processed data and 2 for quality checked data. Please note that the checking is not an absolute proof of high quality. There's limit how much human expert can improve the data if the measurement is technically bad. On the other hand, in some cases (for instance basic meteorology) online-processed data are often almost as good as final checked data. Some variables like eddy-covariance fluxes have additional quality flags that describe the accuracy or representativeness of each observation.
61
62
63 === **Miscellaneous** ===
64
65 Timestamps in the data are Finnish normal time (UTC+2), beginning of aggregation interval if the observation was not instantaneous. If you communicate directly with the database and retrieve the timestamps as datetime (instead of character strings), take care that the database server and your software do not make their own interpretations about the time zone. It's good idea to set session time zone to UTC if there's such risk.
66
67 Basic time step in all tables except flux data (*_EDDY* tables) is one minute. One record can be instantaneous observation (most met & soil measurements) or average or accumulation over 1 min (precipitation, runoff) or 30 min (fluxes).
68
69 Not all variables are measured every minute, in such case there is empty value field (NULL) in the table.
70
71 You can apply basic math, for instance time-averaging, to the data within the database engine using SQL commands but in some cases you better first download 1-min data and do the math yourself:
72
73 (% class="O1" %)
74 (((
75 – wind direction: vector mean requires fairly complex SQL scripting
76 )))
77
78 (% class="O1" %)
79 (((
80 – is it reasonable to calculate time average that contains just one 1 min observation and the rest are NULL?
81 )))
82
83 Time step of one minute means that the amount of data in the database is huge! Don’t try to select everything at once!
84
85
86 === **Data descriptions** ===
87
88 [[Värriö>>doc:SMEAR.The SMEAR Wikispace.SMEAR I (Värriö).Värriö database.WebHome]]
89
90 (% class="confluence-link" %)Hyytiälä
91
92 (% style="margin-left: 30.0px;" %)
93 [[SMEAR II>>doc:SMEAR.The SMEAR Wikispace.SMEAR II (Hyytiälä).SMEAR II Data.Hyytiälä database.WebHome]]
94
95 (% style="margin-left: 30.0px;" %)
96 Siikaneva 1 & 2
97
98 (% style="margin-left: 30.0px;" %)
99 Lake Kuivajärvi
100
101 [[Helsinki>>doc:SMEAR.The SMEAR Wikispace.SMEAR III (Helsinki).Helsinki database.WebHome]]
102
103 (% style="margin-left: 30.0px;" %)
104 Kumpula
105
106 (% style="margin-left: 30.0px;" %)
107 Hotel Torni
108
109 (% style="margin-left: 30.0px;" %)
110 Erottaja Fire station