Similarity Detection
This guide explains the DrillBit Similarity PDF Report, the downloadable PDF document DrillBit generates after a plagiarism check. It tells you how much of your document overlaps with other sources, where each match comes from, and how to interpret the result. Every PDF report uses the same structure; the only things that change from one document to the next are the percentage, the grade letter, and the recommended next step. This guide walks through the report once, then summarises what differs at each grade.
The Four Similarity Grades
DrillBit converts the raw similarity percentage into a single-letter grade so you can read your result at a glance. The headline card layout is identical at every grade; only the colour, the numbers, the matched-source count, and the recommended action differ.
The four bands and their boundaries are fixed:
- A — Satisfactory (0–10%): Minimal overlap with external sources. The document is generally considered original and acceptable without further revision.
- B — Upgrade (11–40%): Moderate similarity. Some sections may need rephrasing or stronger citation before the document is finalised.
- C — Poor (41–60%): High similarity. Substantial portions of the document overlap with external sources and should be reworked before submission.
- D — Unacceptable (61–100%): Very high similarity. The document lacks originality and requires major revision.
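The band boundaries above amount to a simple lookup. Here is a minimal sketch of that mapping (the function name and structure are illustrative, not DrillBit's actual implementation):

```python
def similarity_grade(percent: float) -> tuple[str, str]:
    """Map a similarity percentage (0-100) to its DrillBit grade band.

    Returns (grade letter, band label). Boundaries follow the guide:
    A 0-10, B 11-40, C 41-60, D 61-100.
    """
    if percent <= 10:
        return "A", "Satisfactory"
    elif percent <= 40:
        return "B", "Upgrade"
    elif percent <= 60:
        return "C", "Poor"
    else:
        return "D", "Unacceptable"

similarity_grade(7)   # ("A", "Satisfactory")
similarity_grade(35)  # ("B", "Upgrade")
similarity_grade(62)  # ("D", "Unacceptable")
```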
How to Read a Similarity Report
The DrillBit similarity report is a multi-section PDF. The structure is the same on every report regardless of the result — only the numbers and colours change. Before diving into individual parameters, here is what the report looks like end to end as you scroll through it:
The report has three logical sections, in this order: a Cover Summary with all submission and result metadata, a Similarity Report Header with the headline numbers and the matched-sources table, and the Highlighted Document Content showing the original document with every match flagged inline. The rest of this guide walks through every parameter in each section, in the order they appear.
Cover Summary
The cover section is a single-glance summary of the document and the overall result. It is split into four logical blocks (Submission Information, Result Information, Exclude Information, and Database Selection), with a unique QR code in the corner. Each block is walked through below.
Submission Information
This block records the identity of the document and the person who submitted it. Every field is captured at the moment of upload and cannot be edited afterwards:
- Author Name: The name entered for the document's author at the time of submission. For institutional submissions this is usually the student or staff member; for personal use it is the account holder.
- Title: The document title as supplied during upload. This appears on every downloaded copy of the report.
- Paper / Submission ID: A unique numeric identifier DrillBit assigns to the submission. Use this ID when raising a support ticket so the support team can locate the exact report.
- Submitted by: The email address of the account that uploaded the document. Useful when an instructor or admin reviews work submitted on behalf of others.
- Submission Date: The timestamp the document was received, recorded in YYYY-MM-DD HH:MM:SS format.
- Total Pages, Total Words: A count of how much content was scanned. Both numbers are computed from the parsed document, not the raw file size.
- Document type: The kind of source DrillBit detected — for example Web Page, Article, Thesis, or Synopsis. The document type can affect which exclusion rules apply.
Result Information
This is where the headline number lives. The Similarity percentage represents the proportion of the document's content that matches one or more external sources after exclusions are applied.
Beneath the figure, a horizontal scale strip with markers at 1, 10, 20, 30, 40, 50, 60, 70, 80, and 90 visually places your score on a 0–100 axis. The indicator's position and colour reflect the grade band — far-left in green for A, mid-left in blue for B, mid-right in orange for C, and far-right in red for D.
Two pie charts sit just beneath the percentage:
- Sources Type: Breaks the matched portion of the document down by where the matches came from — typically Internet versus Journal / Publication. The two segment values together add up to the matched portion of the document.
- Report Content: Shows how the document's overall content is distributed, including a Words < 14 segment. This segment represents text that DrillBit identified as matching very short sources (under fourteen words) and excluded automatically. The remainder of the chart represents the original portion of the document.
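The relationship between the two charts can be made concrete with a worked example. All figures below are invented for illustration; the point is the arithmetic, which follows from the descriptions above:

```python
# Worked example with invented figures (not taken from a real report).
internet_pct = 22.0        # Sources Type: Internet segment
journal_pct = 8.0          # Sources Type: Journal / Publication segment
words_under_14_pct = 3.0   # Report Content: short matches auto-excluded

# The Sources Type segments together make up the matched portion of
# the document, i.e. the headline similarity figure.
similarity_pct = internet_pct + journal_pct  # 30.0

# In the Report Content chart, matched text, the Words < 14 segment,
# and original text account for the whole document (100%).
original_pct = 100.0 - similarity_pct - words_under_14_pct  # 67.0
```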
Exclude Information & Database Selection
These two side-by-side blocks tell you the rules under which the report was generated: what was excluded from matching, and which databases were searched.
Exclude Information tells you which exclusion rules were active when the report was generated. Each rule is listed with its current state — Excluded, Not Excluded, or a numeric percentage where applicable:
- Quotes: When excluded, content inside quotation marks is ignored when computing similarity.
- References / Bibliography: When excluded, the references list at the end of the document is removed from the calculation, since citation reuse is expected.
- Source: Excluded < 14 Words: When excluded, any matched span shorter than fourteen words is dropped — this corresponds to the Words < 14 segment in the Report Content chart.
- Excluded Source: The cumulative percentage of similarity removed by manually excluding sources from the analysis report. A value of 0% means no manual exclusions were applied.
- Excluded Phrases: When excluded, custom phrases configured in folder or assignment settings are skipped during matching.
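The "Excluded < 14 Words" rule above is essentially a word-count filter applied to each matched span. A minimal sketch of that filter, with an illustrative function name (not DrillBit's actual API):

```python
def keep_match(span: str, min_words: int = 14) -> bool:
    """Return True if a matched span is long enough to count toward
    similarity when the '< 14 words' exclusion rule is active.

    Spans shorter than 14 words are dropped and reported in the
    Words < 14 segment of the Report Content chart instead.
    """
    return len(span.split()) >= min_words

keep_match("the quick brown fox")    # False: only 4 words, dropped
keep_match(" ".join(["word"] * 14))  # True: exactly 14 words, kept
```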
Database Selection records which DrillBit databases were searched when generating the report. Every entry is a Yes / No flag set by the folder or assignment configuration:
- Language: The language profile used to interpret the document. Most reports run in English mode; regional and non-English documents use a matching language profile.
- Student Papers: Whether the institution's repository of past student submissions was checked.
- Journals & publishers: Whether DrillBit's licensed journal and publication corpus was searched.
- Internet or Web: Whether the open web was crawled and matched against the document.
- Institution Repository: Whether the institution's private document repository was included.
QR Code
A unique QR code is generated for every report. Scanning it opens the same PDF report on a mobile device for quick viewing, downloading, or sharing — useful for showing a result during an in-person review without needing to forward the file:
Similarity Report Header & Matched Sources
After the cover, the report restates the result as three large headline figures: the Similarity %, Matched Sources count, and the Grade letter. The percentage and the grade letter are rendered in the colour of the band — green for A, blue for B, orange for C, red for D. To the right sits the four-band legend so anyone reading the printed report can immediately see how the grade was assigned:
Below the headline row is the Matched Sources table. Each row represents one source that DrillBit found content from, and the table is sorted by contribution — the sources contributing the most similarity appear first:
Every row carries four columns:
- Location: A numeric label (1, 2, 3…) that ties the source back to its highlighted occurrences in the document content section. The number is shown inside a coloured chip; the chip colours rotate purely to help you tell adjacent rows apart and carry no other meaning.
- Matched Domain: The internet domain or publication name where the match was found. Internet domains are typically clickable in the on-screen viewer.
- %: How much of the overall document this single source contributed to the similarity figure.
- Source Type: Either Internet Data for crawled web sources or Publication for journal, conference, and book content. The split is the same one visualised in the Sources Type chart in the cover summary.
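Putting the four columns together, the table can be modelled as rows sorted by contribution, largest first, with the row number serving as the Location chip. The domains and percentages below are invented for illustration; whether per-source percentages sum exactly to the headline figure depends on how overlapping matches are counted, so treat any total as approximate:

```python
# Hypothetical matched sources: (domain, percent contribution, source type).
sources = [
    ("example-university.edu", 2.0, "Internet Data"),
    ("journals.example.org", 5.0, "Publication"),
    ("en.wikipedia.org", 9.0, "Internet Data"),
]

# The report orders rows by contribution, largest first, and numbers
# them; that number is the Location chip used in the highlights.
ranked = sorted(sources, key=lambda s: s[1], reverse=True)
for location, (domain, pct, source_type) in enumerate(ranked, start=1):
    print(f"{location}  {domain}  {pct}%  {source_type}")
```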
Highlighted Document Content
The remainder of the report is a faithful render of the original document with every matched span visibly highlighted. Each highlight uses a coloured background that matches the location chip from the matched-sources table, and the location number floats next to the matched text:
This section lets you trace any individual match back to the source it came from: find a highlighted phrase, note the number above or beside it, then look that number up in the matched-sources table to see which domain or publication contributed it. The density of highlighting is itself a quick visual signal — sparse highlights mean a Satisfactory report, while heavy highlighting that dominates the page indicates a Poor or Unacceptable result.
A — Satisfactory (0–10%)
A Satisfactory report is the result you want. Ten percent or less of the document matches anything DrillBit could find, so the document is generally considered original and acceptable without further revision.
- Visual cue: The headline percentage and grade letter A are rendered in green.
- Matched-sources table: Typically a small number of rows (often single digits), each contributing only one or two percent.
- Highlighted document: Sparse, short highlights scattered across the page. The unhighlighted (original) text dominates — the visual confirmation of the green grade.
- Recommended action: No action needed. The document has cleared the originality check.
B — Upgrade (11–40%)
An Upgrade report indicates moderate similarity. Some sections of the document overlap with external sources, and parts may need rephrasing or stronger citation before the document is finalised.
- Visual cue: The headline percentage and grade letter B are rendered in blue.
- Matched-sources table: A larger list — typically a dozen or more sources — with several rows contributing meaningfully (3–8% each).
- Highlighted document: Visible clusters of highlights, often grouped around certain paragraphs or sections rather than spread evenly.
- Recommended action: Review and refine. Use the matched-sources table to identify which paragraphs need rewording or stronger attribution, then resubmit.
C — Poor (41–60%)
A Poor report indicates high similarity. Substantial portions of the document overlap with external sources, and significant rework is required before the document can be considered acceptable.
- Visual cue: The headline percentage and grade letter C are rendered in orange.
- Matched-sources table: Many rows, with the top contributors each accounting for double-digit percentages.
- Highlighted document: Heavy highlighting across most paragraphs — often more highlighted than original text in some sections.
- Recommended action: Significant rework. Identify the dominant matched sources, rewrite affected paragraphs in the author's own words, and add proper citations where references are appropriate.
D — Unacceptable (61–100%)
An Unacceptable report indicates very high similarity. The document lacks originality and a major rewrite is required before it can be reconsidered.
- Visual cue: The headline percentage and grade letter D are rendered in red.
- Matched-sources table: Long list, often with one or two sources contributing very large shares of the total similarity.
- Highlighted document: Dominant highlighting — the original text becomes the minority. Most paragraphs are heavily flagged.
- Recommended action: Major revision. Treat the document as a draft to be substantially rewritten rather than corrected; consult instructors or supervisors before resubmitting.