Use Wikidata to complete seeds #50

Initially, the news crawler was seeded with URLs of news sites taken from DMOZ (see #8 for the procedure). DMOZ is no longer updated, but [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) could serve as a replacement source to complete the seed list:
- select all instances of [newspaper](https://www.wikidata.org/wiki/Q11032) ([news media](https://www.wikidata.org/wiki/Q1193236), or similar) having an [official website](https://www.wikidata.org/wiki/Property:P856):
  ```sparql
  SELECT DISTINCT ?item ?itemLabel ?lang ?url
  WHERE
  { 
    ?item wdt:P31/wdt:P279* wd:Q11032.
    ?item wdt:P856 ?url.  # with official website
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en,de,ru,fr,es,it,ja,zh,*" }
    OPTIONAL {
       ?item wdt:P407 ?language.
       ?language wdt:P220 ?lang.
     }
  }
  LIMIT 50
  ```
  ([execute query on Wikidata query service](https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fitem%20%3FitemLabel%20%3Flang%20%3Furl%0AWHERE%0A%7B%20%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ11032.%0A%20%20%3Fitem%20wdt%3AP856%20%3Furl.%20%20%23%20with%20official%20website%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cde%2Cru%2Cfr%2Ces%2Cit%2Cja%2Czh%2C%2a%22%20%7D%0A%20%20OPTIONAL%20%7B%0A%20%20%20%20%20%3Fitem%20wdt%3AP407%20%3Flanguage.%0A%20%20%20%20%20%3Flanguage%20wdt%3AP220%20%3Flang.%0A%20%20%20%7D%0A%7D%0ALIMIT%2050))
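
To turn the query results into seed candidates, something like the following Python sketch might work. It is an untested idea rather than the crawler's actual tooling: the function names `fetch_results` and `extract_seeds`, the User-Agent string, and the output layout are all made up for illustration. The query is the one above without the `LIMIT` and without the label service, since labels are not needed for seeding; the full query may need tuning if it hits the query service timeout.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata query service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Same pattern as the query above, minus LIMIT and the label service.
QUERY = """
SELECT DISTINCT ?item ?lang ?url
WHERE
{
  ?item wdt:P31/wdt:P279* wd:Q11032.
  ?item wdt:P856 ?url.  # with official website
  OPTIONAL {
    ?item wdt:P407 ?language.
    ?language wdt:P220 ?lang.
  }
}
"""


def fetch_results(query=QUERY, user_agent="news-seed-sketch/0.1 (hypothetical)"):
    """Run the query and return the parsed SPARQL JSON results.

    The User-Agent value is a placeholder; replace it with a real contact.
    """
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={
            "User-Agent": user_agent,
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


def extract_seeds(results):
    """Collapse result rows into one entry per URL with its language codes.

    Rows whose URL is not http(s) or has no host are skipped.
    """
    seeds = {}
    for row in results["results"]["bindings"]:
        url = row["url"]["value"].strip()
        parts = urllib.parse.urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            continue
        entry = seeds.setdefault(
            url, {"url": url, "host": parts.hostname, "langs": set()}
        )
        # ?lang is bound only when the OPTIONAL block matched.
        if "lang" in row:
            entry["langs"].add(row["lang"]["value"])
    return sorted(seeds.values(), key=lambda seed: seed["url"])
```

Calling `extract_seeds(fetch_results())` would give a list of URL, host and language-code entries that could then be compared against the existing seed list. An item with several languages yields one row per language, which is why the rows are merged per URL.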
