The ONLINE2 corpus (internally subdivided into two source corpora, ONLINE2_NOW and ONLINE2_ARCHIVE) is a monitor corpus of the dynamic content of the Czech web, i.e. internet journalism. The corpus covers the period from April 2021 to the present.
It was created at the CNC with the help of data kindly provided by the Monitora company. The two parts of the corpus differ in extent and in how often they are updated: ONLINE2_NOW contains data from the current month plus the 6 preceding months and is updated daily; ONLINE2_ARCHIVE contains data from April 2021 up to the date when ONLINE2_NOW begins and is updated every month.
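
The division between the two parts can be illustrated with a short Python sketch. It is not CNC code: the function names are hypothetical, and it assumes that the ONLINE2_NOW window starts on the first day of the sixth month before the current one and that the corpus starts on 1 April 2021. Because ONLINE2_ARCHIVE is updated only monthly, the real boundary may lag behind this calculation.

```python
from datetime import date

# "April 2021" comes from the description; the exact day (1st) is an assumption.
CORPUS_START = date(2021, 4, 1)


def now_window_start(today: date, preceding_months: int = 6) -> date:
    """Return the first day of the month `preceding_months` before today's month."""
    # Count months linearly so that year boundaries are handled by integer division.
    month_index = today.year * 12 + (today.month - 1) - preceding_months
    return date(month_index // 12, month_index % 12 + 1, 1)


def subcorpus_for(published: date, today: date) -> str:
    """Name the part of ONLINE2 that a text published on `published` would fall into."""
    if published < CORPUS_START or published > today:
        raise ValueError("date outside the span of the corpus")
    if published >= now_window_start(today):
        return "ONLINE2_NOW"
    return "ONLINE2_ARCHIVE"
```

Under these assumptions, on 15 March 2024 the ONLINE2_NOW window would begin on 1 September 2023, so a text from 31 August 2023 would belong to ONLINE2_ARCHIVE.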