Glossary and Monitor Logic

A plain-English reference for the words and decisions used inside the Mozilla Media Monitor. The goal is to make the review workflow inspectable, so analysts can see where a link was missed, filtered, scraped, or kept.

Source Tiers

Verified

A source that has been reviewed and is trusted enough to appear in final report workflows when the article itself is relevant.

Discovery

A useful source that should stay visible for learning and coverage checks, but is not yet treated as a final-report source.

Watchlist

A source worth monitoring, but handled cautiously because it may be noisy, indirect, syndicated, financial, job-board-like, or special-purpose.

Unclassified

A publisher that has not yet been manually assigned to one of the three source tiers.

A source tier is not an article verdict. A Verified source can publish irrelevant articles, and an Unclassified source can publish something important. The tier tells the app how to prioritize source review and how much confidence to carry into report building.
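One way to read the tiers is as an ordered review priority. A minimal sketch of that idea, assuming the tier names above; the numeric weights and function name are illustrative, not the app's actual values:

```python
# Hypothetical priority weights for the four source tiers described above.
TIER_PRIORITY = {
    "Verified": 3,      # trusted enough for final-report workflows
    "Discovery": 2,     # kept visible for learning and coverage checks
    "Watchlist": 1,     # monitored, but handled cautiously
    "Unclassified": 0,  # not yet manually assigned
}

def review_priority(tier: str) -> int:
    """Return a review priority for a publisher tier; unknown tiers rank lowest."""
    return TIER_PRIORITY.get(tier, 0)
```

The key property is that the tier only orders review attention; it never decides whether an individual article is relevant.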

Compare Page Terms

Human report links

The list of URLs pasted in by an analyst from the manually built report.

Caught by monitor

The pasted URL matched a record the monitor already stored in the selected time window.

Not caught

The pasted URL had no matching monitor record. This usually points to a source coverage gap, timing gap, or URL matching gap.

Monitor-only candidate

The monitor kept an article that was not in the pasted human report links. These are candidates for analyst review.

Filtered before human review

The monitor saw the link, but an intentional rule filtered it out before it entered the normal human review queue.

Scrape alert

The monitor found the link but could not read enough article text. The link may be important, but the app has less evidence.
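The first three compare outcomes amount to a set comparison between the pasted human report links and the monitor's stored records. A minimal sketch, assuming both sides have already been normalized to comparable URL strings (the function and key names are illustrative):

```python
def compare_links(human_links: set[str], monitor_links: set[str]) -> dict[str, set[str]]:
    """Classify URLs by who saw them, mirroring the compare-page terms."""
    return {
        "caught_by_monitor": human_links & monitor_links,       # in both
        "not_caught": human_links - monitor_links,              # human-only
        "monitor_only_candidates": monitor_links - human_links, # monitor-only
    }
```

Anything in "not_caught" points at a source, timing, or URL-matching gap; anything in "monitor_only_candidates" goes to analyst review.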

How A Link Moves Through The Monitor

  1. The monitor reads configured feeds and searches from data/feeds.csv.
  2. It normalizes each URL so duplicate links are easier to recognize.
  3. It skips links already seen unless they previously failed in a way that should be retried.
  4. It tries to scrape the article title, date, full text, publisher, and metadata.
  5. It applies conservative guardrails for obvious non-article or non-Mozilla material.
  6. It sends remaining candidates through relevance review and stores the outcome in Supabase.
  7. The dashboards read those stored records and show what was kept, filtered, missed, or found only by the monitor.
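Step 2 above, URL normalization, is what makes steps 3 and the compare page work: two links to the same article must normalize to the same string. A sketch of common normalizations, assuming typical rules (lowercased scheme and host, dropped fragment, stripped tracking parameters, no trailing slash); the monitor's actual rules may differ:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking-parameter prefixes; the real list may be longer.
TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")

def normalize_url(url: str) -> str:
    """Canonicalize a URL so duplicate links are easier to recognize."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment entirely; it never changes which article is served.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )
```

Under these rules, "https://Example.com/post/?utm_source=x#top" and "https://example.com/post" collapse to the same record key.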

What Analysts Can Change

Analysts can add feeds, block noisy publishers, move publishers between Verified, Discovery, and Watchlist, mark articles reviewed, and use the compare page to identify source gaps. Those actions improve coverage without asking the AI to silently hide more material.
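Adding a feed means adding a row to data/feeds.csv. A hypothetical row, purely to illustrate the shape; the real column names and schema of data/feeds.csv may differ:

```csv
feed_url,publisher,tier
https://example.com/rss,Example Tech News,Discovery
```

New publishers can start in Discovery or Unclassified and be promoted once an analyst has reviewed their output.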

The safest operating rule is: improve what the monitor sees first, improve scraping second, and only then tighten automatic filtering. That keeps the app from overreaching while the team is still learning where coverage gaps come from.