Methodology & Data Sources

How the catalogue is assembled — IMDb non-commercial datasets, Wikidata SPARQL, manual curation, and the join logic that ties them together.

Primary source — IMDb non-commercial datasets

The catalogue’s spine is the IMDb non-commercial dataset bundle published at datasets.imdbws.com. We download four tables (title.principals, title.basics, title.ratings and title.akas), keep the title.principals rows whose nconst is nm0462050, then join the resulting tconst list against the other three tables (title.akas carries the same identifier in its titleId column). This yields 56 non-episode credits with character names, runtime, genre, IMDb rating and original Japanese titles.
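The filter-then-join step described above can be sketched with the Python standard library. This is a minimal illustration, not the site's actual importer: the function names, the output field names, the use of titleType "tvEpisode" to exclude episodes, and the use of region "JP" to pick the Japanese title are all assumptions. The column names follow IMDb's published dataset documentation.

```python
import csv
import gzip

NCONST = "nm0462050"  # person ID quoted in the text above


def read_tsv(path):
    """Yield each row of a gzipped IMDb TSV file as a dict keyed by header."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as fh:
        # IMDb files are unquoted TSV, so quote characters are plain data.
        yield from csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)


def clean(value):
    """Map IMDb's null marker (backslash-N) to None."""
    return None if value == "\\N" else value


def build_catalogue(principals, basics, ratings, akas, nconst=NCONST):
    """Filter principals by nconst, then join the other three tables."""
    # 1. Filter title.principals; keep the first credit row per title.
    characters = {}
    for row in principals:
        if row["nconst"] == nconst:
            characters.setdefault(row["tconst"], clean(row["characters"]))

    # 2. Join title.basics, dropping episodes (assumed reading of "non-episode").
    records = {}
    for row in basics:
        tconst = row["tconst"]
        if tconst in characters and row["titleType"] != "tvEpisode":
            records[tconst] = {
                "tconst": tconst,
                "characters": characters[tconst],
                "runtime": clean(row["runtimeMinutes"]),
                "genres": clean(row["genres"]),
                "rating": None,
                "title_ja": None,
            }

    # 3. Join title.ratings on tconst.
    for row in ratings:
        if row["tconst"] in records:
            records[row["tconst"]]["rating"] = clean(row["averageRating"])

    # 4. Join title.akas on titleId; region "JP" is an assumed selector,
    #    and only the first matching row per title is kept.
    for row in akas:
        rec = records.get(row["titleId"])
        if rec is not None and row["region"] == "JP" and rec["title_ja"] is None:
            rec["title_ja"] = clean(row["title"])

    return sorted(records.values(), key=lambda r: r["tconst"])
```

With the four files downloaded locally, a call would look like build_catalogue(read_tsv("title.principals.tsv.gz"), read_tsv("title.basics.tsv.gz"), read_tsv("title.ratings.tsv.gz"), read_tsv("title.akas.tsv.gz")). Each table is streamed once, so only the matching rows are held in memory.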

Enrichment — Wikidata (CC0)

For director attributions and Japanese-language titles missing from IMDb’s title.akas, we run a SPARQL query against Wikidata entity Q265091 (Satomi Kobayashi), matching films through property P161 (cast member) and reading their directors from P57 (director). The query and its full result set are reproducible from the SPARQL endpoint at query.wikidata.org.
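One plausible shape for that query, wrapped in a small Python client, is sketched here. The exact query used for the catalogue is not reproduced on this page, so the following is an assumption: it selects items with a P161 statement pointing at Q265091, optionally pulls P57, and additionally reads P345 (IMDb ID) and the Japanese label so that results can be matched back to tconst values. The function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query shape; __ENTITY__ is replaced with the Wikidata item ID.
QUERY_TEMPLATE = """
SELECT ?film ?filmLabel ?titleJa ?director ?directorLabel ?imdb WHERE {
  ?film wdt:P161 wd:__ENTITY__ .
  OPTIONAL { ?film wdt:P57 ?director . }
  OPTIONAL { ?film wdt:P345 ?imdb . }
  OPTIONAL { ?film rdfs:label ?titleJa . FILTER(LANG(?titleJa) = "ja") }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def build_query(entity="Q265091"):
    """Return the SPARQL text for one cast member."""
    return QUERY_TEMPLATE.replace("__ENTITY__", entity)


def parse_bindings(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query, user_agent="catalogue-sketch/0.1 (hypothetical contact)"):
    """Send the query to the public endpoint and return flattened rows."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,  # a descriptive agent is expected by the endpoint
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_bindings(json.load(response))
```

Calling run_query(build_query()) returns one dict per film and director pairing; rows that carry an "imdb" value can then be matched against the tconst list from the IMDb import.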

Awards table

Wikidata exposes only one P166 statement for Q265091; the broader awards index is hand-curated from publicly documented jury announcements (Japan Academy Film Prize, Blue Ribbon Awards, Mainichi Film Concours, Kinema Junpo, Yokohama Film Festival, Hochi Film Award). Each entry cites the awarded title via its IMDb tconst.
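A hand-curated entry of this kind might be modelled as follows. The field names are hypothetical and the record layout is an assumption, not the catalogue's actual schema; the only constraints taken from the text are that each entry names an awarding body and points at a title by tconst.

```python
import re
from dataclasses import dataclass

# IMDb title IDs are "tt" followed by at least seven digits.
TCONST_RE = re.compile(r"tt\d{7,}")


@dataclass(frozen=True)
class AwardEntry:
    ceremony: str      # e.g. one of the awarding bodies listed above
    year: int
    category: str
    tconst: str        # IMDb ID of the awarded title
    source_note: str   # where the jury announcement was documented

    def __post_init__(self):
        # Reject malformed IDs at curation time rather than at join time.
        if not TCONST_RE.fullmatch(self.tconst):
            raise ValueError(f"not an IMDb tconst: {self.tconst!r}")


def unknown_titles(entries, catalogue_tconsts):
    """Return award entries whose tconst is absent from the catalogue."""
    return [entry for entry in entries if entry.tconst not in catalogue_tconsts]
```

Running unknown_titles over the curated list after each import would flag entries that cite a title the IMDb join did not produce.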

Refresh cadence

IMDb regenerates its dataset snapshot daily. We re-import it on demand via a guarded admin trigger rather than on a fixed schedule; see Editorial Policy for the change log behaviour.

What we don’t store

No personally identifying viewer data, no third-party trackers, no scraped images. Posters and stills are placeholders generated locally.