Document Quality Analysis
This guide explains the DrillBit Grammar Checker PDF Report — the downloadable PDF document DrillBit generates after analysing a document for grammar, phrase quality, vocabulary richness, and structural clarity. It is the most parameter-rich of the three DrillBit reports: alongside basic submission metadata it provides character and word counts, reading and execution time estimates, an overall Grammar Quality score, the four sub-scores that feed into it, a detailed phrase analysis, an indexed-content table, and a structured list of grammar mistakes with category-by-category explanations.
How to Read a Document Quality Report
The Grammar Checker PDF report has four logical pages, in this order: a Cover Summary with submission metadata, text statistics, and the four headline scores; a Detailed Analysis page that breaks Phrases Quality down into eight individual metrics and lists any duplicate sentences and indexed-content categories; a Submitted Text page rendering the document line by line with grammar markers highlighted inline; and a Grammar Info page detailing every flagged phrase with a category description.
Cover Summary
The cover summary block packs four pieces of information into one page: who submitted what (Submission Information), how big the document is (Submitted Text), how long it takes to read or speak (Reading and Execution Time), and the headline scores (Result Information).
Submission Information
This block records the identity of the document and the person who submitted it. Every field is captured at the moment of upload and cannot be edited afterwards.
- Author Name: The name entered for the document's author at the time of submission.
- Title: The document title as supplied during upload. Appears on every downloaded copy.
- Paper / Submission ID: A unique numeric identifier DrillBit assigns to the submission. Quote it when raising a support ticket.
- Submitted by: The email address of the account that uploaded the document.
- Submission Date: The timestamp the document was received, recorded in YYYY-MM-DD HH:MM:SS format.
- Document type: The kind of source DrillBit detected — for example Assignment, Article, Thesis, or Synopsis.
Submitted Text and Reading Time
Two compact blocks summarise the size of the document and how long it takes to consume:
Submitted Text reports four counts taken straight from the parsed document:
- Characters: Total characters in the document — including letters, numbers, punctuation, and spaces.
- Words: Total word count, calculated after the document has been tokenised.
- Sentences: The number of sentences DrillBit identified. Useful for spotting unusually long or short sentences when compared with the word count.
- Lines: The number of distinct lines of text. Lines are not the same as sentences — a single sentence can span multiple lines, and a single line can contain multiple sentences.
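As a rough illustration, the four counts can be reproduced with a short script. The tokenisation choices below (whitespace word split, naive sentence split on terminal punctuation, blank lines excluded from the line count) are assumptions for the sketch; DrillBit's own parser may count differently.

```python
import re

def text_statistics(text: str) -> dict:
    """Approximate the four Submitted Text counts for a document.

    The word and sentence rules here are simplifying assumptions;
    DrillBit's own tokeniser is not published and may differ.
    """
    words = re.findall(r"\S+", text)                      # whitespace-delimited tokens
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())  # naive split on . ! ?
    lines = text.splitlines()
    return {
        "characters": len(text),  # letters, digits, punctuation, and spaces
        "words": len(words),
        "sentences": len([s for s in sentences if s]),
        "lines": len([ln for ln in lines if ln.strip()]),  # non-blank lines only
    }

# Note: one line can hold two sentences, and counts diverge accordingly.
sample = "DrillBit parses each line. A line may hold two sentences!\nA second line."
print(text_statistics(sample))
```

Note how the sample yields three sentences across two lines, illustrating the point above that lines and sentences are not the same unit.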
Reading and Execution Time gives three duration estimates. All three are formatted as X Hr, Y Min:
- Reading: How long the document takes to read silently, calculated from the word count using a standard adult reading-speed assumption.
- Speaking: How long the document takes to read aloud. Always longer than the silent-reading time because spoken delivery is slower.
- Execution: How long DrillBit's grammar checker took to analyse the document. Useful as a sanity check; very long execution times can indicate a very large document or a transient issue.
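The reading and speaking estimates follow directly from the word count and an assumed words-per-minute rate. DrillBit does not publish its exact speed constants, so the figures below use commonly cited adult averages purely as illustrative defaults:

```python
def duration_estimate(word_count: int, words_per_minute: int) -> str:
    """Format an estimated duration as 'X Hr, Y Min' from a word count.

    The speeds passed in are assumptions: DrillBit's actual reading- and
    speaking-speed constants are not documented.
    """
    total_minutes = max(1, round(word_count / words_per_minute))
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} Hr, {minutes} Min"

# Illustrative averages only -- not DrillBit's published values:
READING_WPM = 240   # silent reading
SPEAKING_WPM = 150  # reading aloud is slower, so the estimate is longer

print(duration_estimate(12_000, READING_WPM))   # prints "0 Hr, 50 Min"
print(duration_estimate(12_000, SPEAKING_WPM))  # prints "1 Hr, 20 Min"
```

Because the speaking rate is lower than the reading rate, the speaking estimate is always the longer of the two, exactly as the report shows.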
Result Information — Four sub-scores
The Result Information block is the most important part of the cover. It shows one overall Grammar Quality percentage in red, followed by the four sub-scores that contribute to it:
- Grammar Quality (overall): A combined score that reflects writing quality across vocabulary variety, freedom from duplication, balance of indexed vs unindexed content, and absence of grammar mistakes. The "Except Similarity & AI Content" qualifier reminds you that this score is independent of the Similarity and AI Content reports.
- 1. Phrases Quality: A composite of the eight phrase metrics shown on the Detailed Analysis page — the higher the score, the richer and more varied the phrasing.
- 2. Non-Duplicate Content: The percentage of content that is not repeated within the document. The annotation in brackets — e.g. (Duplicate 0.0%) — shows the complement: the percentage that is duplicated.
- 3. Indexed Content: The percentage of the document that DrillBit was able to categorise into known content types (e.g. references, captions, quoted material). A higher percentage typically means the document follows recognisable academic structure.
- 4. Grammar Info: The percentage of the document free of grammar mistakes. The annotation in brackets — e.g. (Mistakes 0, Suggestion 1) — shows the raw counts of detected mistakes and suggested improvements.
Detailed Analysis
The next section breaks down the headline scores into individual metrics. Three sub-blocks make up this section: the eight phrase-quality metrics, the duplicate-content listing, and the indexed-content table.
Phrases Quality breakdown
Eight metrics together describe how the document uses words and sentences. The first four classify each word's character composition; the last four describe vocabulary richness and sentence structure:
- Only Alphabets: Percentage of words composed entirely of letters — ordinary prose. The bulk of a typical academic document falls in this bucket.
- Only Numbers: Percentage of words composed entirely of digits. Useful as a quick check that a document isn't unexpectedly heavy on raw numeric content.
- Alpha-numeric: Percentage of words mixing letters and digits — e.g. product codes, version labels, citation references like section 4a.
- Words with Special Characters: Percentage of words containing punctuation, hyphens, or other non-alphanumeric characters — including hyphenated terms like multi-faceted.
- Unique Words: Percentage of distinct terms used exactly once in the document. A high figure indicates a varied vocabulary; a low figure indicates many repeated words.
- Rare Words: Percentage of words that are not among the 5,000 most common English words. A higher figure suggests advanced vocabulary; an unusually high figure may signal artificially elevated wording.
- Common Words: Percentage of words that are among the 1,000 most common English words. Reasonable proportions are expected; very high figures suggest simplistic phrasing.
- Word Length: Average characters per word. Typical English prose averages around 4–6 characters; markedly higher numbers point to dense, technical writing.
- Sentence Length: Average words per sentence. Lower numbers indicate short, direct sentences; higher numbers indicate complex, multi-clause structures.
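The eight metrics above can be approximated as follows. The tokenisation, the terminal-punctuation stripping, and the caller-supplied `common_words` set (standing in for a frequency list such as the 1,000 most common English words) are illustrative assumptions, not DrillBit's published definitions:

```python
import re
from collections import Counter

PUNCT = ".,;:!?\"'()"  # terminal punctuation stripped before classifying a word

def phrase_metrics(text: str, common_words: set[str]) -> dict:
    """Approximate the eight Phrases Quality metrics for a document.

    `common_words` is a stand-in for a real frequency list; supplying
    one (e.g. the top 1,000 English words) is the caller's responsibility.
    """
    words = [w.strip(PUNCT) for w in re.findall(r"\S+", text)]
    words = [w for w in words if w]
    n = len(words)
    counts = Counter(w.lower() for w in words)

    only_alpha  = sum(w.isalpha() for w in words)
    only_digits = sum(w.isdigit() for w in words)
    mixed       = sum(w.isalnum() and not w.isalpha() and not w.isdigit()
                      for w in words)
    special     = n - only_alpha - only_digits - mixed  # hyphens, internal punctuation
    unique      = sum(1 for c in counts.values() if c == 1)
    common      = sum(w.lower() in common_words for w in words)

    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return {
        "only_alphabets_%": round(100 * only_alpha / n, 1),
        "only_numbers_%":   round(100 * only_digits / n, 1),
        "alpha_numeric_%":  round(100 * mixed / n, 1),
        "special_chars_%":  round(100 * special / n, 1),
        "unique_words_%":   round(100 * unique / n, 1),
        "common_words_%":   round(100 * common / n, 1),
        "word_length":      round(sum(len(w) for w in words) / n, 1),
        "sentence_length":  round(n / (len(sentences) or 1), 1),
    }

m = phrase_metrics("The quick fox ran 3 times past section 4a.", {"the"})
print(m["only_alphabets_%"], m["alpha_numeric_%"])  # 77.8 11.1
```

In the usage line, `4a` lands in the alpha-numeric bucket and `3` in only-numbers, matching the classifications described above.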
Non-Duplicate Content
Right after the phrase breakdown, the report lists any duplicate content it found. Duplicates are reported in two categories:
- Duplicate Sentences: Whole sentences that appear more than once in the document. If none are found, the section reads --NIL--.
- Duplicate Sub-Strings: Phrases of several consecutive words that repeat. Used to catch near-duplicates that aren't full sentences but still reduce vocabulary variety.
The percentage shown for Non-Duplicate Content on the cover is calculated as 100% − (duplicates as a share of total content). A high score means the document is largely original (in the within-document sense), not that it is original relative to outside sources — that's what the Similarity report is for.
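A within-document duplicate check of this kind can be sketched as below. The five-word sub-string window, the lowercase normalisation, and counting only the repeat copies (beyond the first) toward the duplicate share are all illustrative choices, not DrillBit's documented parameters:

```python
import re
from collections import Counter

def duplicate_report(text: str, window: int = 5) -> dict:
    """Find duplicate sentences and duplicate word sub-strings, then derive
    Non-Duplicate Content as 100 minus the duplicate share of all words.
    """
    sentences = [s.strip().lower()
                 for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    sent_counts = Counter(sentences)
    dup_sentences = [s for s, c in sent_counts.items() if c > 1]

    # Sliding window of `window` consecutive words catches near-duplicates
    # that are not full sentences.
    words = text.lower().split()
    grams = Counter(tuple(words[i:i + window])
                    for i in range(len(words) - window + 1))
    dup_substrings = [" ".join(g) for g, c in grams.items() if c > 1]

    # Assumption: only the repeat copies (beyond the first) count as duplicate
    # words -- one plausible reading of the cover's (Duplicate x%) annotation.
    dup_words = sum((c - 1) * len(s.split())
                    for s, c in sent_counts.items() if c > 1)
    return {
        "duplicate_sentences": dup_sentences or "--NIL--",
        "duplicate_substrings": dup_substrings or "--NIL--",
        "non_duplicate_%": round(100 - 100 * dup_words / max(1, len(words)), 1),
    }

print(duplicate_report("This sentence repeats. This sentence repeats. A fresh one here."))
```

A document with no repeats returns `--NIL--` for both listings and a Non-Duplicate score of 100.0, mirroring the report's own presentation.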
Indexed Content
The Indexed Content table shows how DrillBit categorised every line in the document. Each row is one category, with line and word counts and a percentage:
- Sl. No: The serial number of the row, useful when referring to a specific category in feedback.
- Index: The category DrillBit assigned. Common values include Other Data (general body text), References, Quotes, and similar structural categories.
- Lines: How many lines of the document fall in this category.
- Words: How many words fall in this category.
- % in Report: The category's share of the total document. The view link in the on-screen viewer opens just those lines for inspection.
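The % in Report column is a simple share calculation. The sketch below assumes the percentage is word-based; the report could equally derive it from line counts:

```python
def category_share(category_words: int, total_words: int) -> float:
    """Return a category's '% in Report': its share of the document's words.

    Word-based weighting is an assumption; DrillBit does not state
    whether the column is computed from words or lines.
    """
    return round(100 * category_words / total_words, 2)

print(category_share(1_240, 15_500))  # prints 8.0
```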
Submitted Text
After the Detailed Analysis, the report renders the entire document line-by-line with line numbers (Line 1|, Line 2|, ...). The text is exactly as DrillBit parsed it — useful both as a reference copy and so that any grammar marker mentioned later (e.g. "Line 12") points back to the correct location. Where the grammar checker has flagged a phrase, the trigger words are underlined inline within the rendered text.
Grammar Info — Detailed mistake list
The final section of the report lists every grammar mistake or suggestion DrillBit detected, one card per finding. Each card carries the exact phrase under review, the grammar category it belongs to, the recommended replacement, and a worked example. Below each card is a Category Description block that explains the grammar concept in plain language so the suggestion is actionable even if the reader isn't a grammar specialist:
- Phrase & line: The exact text DrillBit flagged and the line number where it appears in the Submitted Text page above. Use the line number to find the surrounding context in the original document.
- Category: The grammar concept the issue belongs to (Prepositions, Subject-Verb Agreement, Article Usage, and so on). Identical to the heading in the Category Description block underneath.
- Suggestion: DrillBit's recommended fix. The struck-through text is the original; the highlighted text is the proposed replacement.
- Example: A link that, in the on-screen viewer, opens a worked example of the issue. In the printed PDF this remains as a view placeholder.
- Category Description: A plain-language definition of the grammar category, plus an Incorrect / Correct sentence pair illustrating the rule. Provided so the suggestion is actionable without consulting a separate grammar reference.