This Study Guide addresses the topic of essay writing. The essay is used as a form of assessment in many academic disciplines, and is used in both coursework and exams. It is the most common focus for study consultations among students using Learning Development.
Other useful guides: What is critical reading?; What is critical writing?; Thought mapping; Referencing and bibliographies; Avoiding plagiarism; The art of editing.
A collection of Question lists is available via the Learning Development website. These lists suggest questions to ask of your writing when you are reviewing it.
To produce a high quality essay you need to demonstrate your ability:
to understand the precise task set by the title;
to identify appropriate material to read;
to understand and evaluate that material;
to select the most relevant material to refer to in your essay;
to construct an effective argument; and
to arrive at a well-supported conclusion.
The need to use such a wide range of academic skills is probably the main reason why the essay format is so popular with tutors as an assignment.
The word limit adds to the challenge by requiring that all of these skills be demonstrated within a relatively small number of words. Producing incisive and clear written work within a word limit is an important skill in itself, which will be useful in many aspects of life beyond university.
Good, constructively critical feedback can give you excellent guidance on how to improve your essay writing. It is worth attending to all of the suggestions and comments you receive, and trying to act on them.
Common criticism given to students is that their essay:
does not keep to the title that was set;
has a poor structure;
is too descriptive;
does not have enough critical writing.
These criticisms highlight the three basic elements of good essay writing:
attending closely to the title;
establishing a relevant structure that will help you show the development of your argument; and
using critical writing as much as possible, with descriptive writing being used where necessary, but kept to a minimum.
These elements will be used to give a broad overall structure to this Study Guide.
Attending closely to the title
The most important starting point is to listen carefully to what the essay title is telling you.
You need to read every single word of it, and to squeeze out as much guidance as you can from the title. Then you need to plan how you will respond to every single element of the title. The guidance given to you by the title is freely available, and is your best clue to what is required in your essay.
As a tutor has said (Creme and Lea, 1997 p41):
‘When my students ask me about essay writing, there are three main pieces of advice that I give them. One, answer the question. Two, answer the question. Three, answer the question.’
This is important at the start, but also throughout your writing, as it can be easy to drift away and waste valuable words from your word limit by writing material that may be interesting, but which is not relevant to the title set.
The Mini Guide: Essay terms explained, and Questions to ask about interpreting essay titles may be useful.
To start you off, and to minimise the likelihood of writer’s block, a useful exercise is to do a ‘brainstorm’ of all your ideas in connection with the essay title. It can be a way of making a lot of progress quite quickly.
It can be stressful and very difficult trying to work out solely in your mind how to tackle an essay title; asking yourself questions such as: What structure should I use? What are my main points? What reading do I need to do? Have I got enough evidence? It can be much less stressful to throw all your thoughts down on paper, before you start trying to find answers to these questions.
In these early stages of your thinking you may not be sure which of your ideas you want to follow up and which you will be discarding. So, don’t feel you have to make that decision in your head before you write anything. Instead, you can catch all of your ideas, in no particular order, on a sheet or two of A4. Once they are down there it will be easier for you to start to review them critically and to see where you need to focus your reading and note taking.
Breaking it down then building it up
Essentially, this is what you are doing within the essay process: breaking ideas down, then building them up again. You need to:
- break down the essay title into its component parts, and consider possible ways of addressing them;
- work with these component parts, as you select your reading and make relevant notes;
- build up the essay using the material you have collected: ordering it; presenting and discussing it; and forming it into a coherent argument.
Throughout this process, the essay title is the single immovable feature. You begin there; you end there; and everything in between needs to be placed in relation to that title.
All three of the processes described above will inform your decisions about what you need to read for a particular essay. If left unplanned, the reading stage can swallow up huge amounts of time. Fortunately, there is scope for developing efficiency in several ways:
- making intelligent decisions, based on your initial planning, about which sources to target, so you don’t spend time reading less relevant, or even completely irrelevant material;
- reading with a purpose, so that you are looking out for particularly relevant material, rather than paying equal attention to material that is less relevant;
- systematic note taking, so that you record the most relevant material, and that you have full reference details (including page numbers of direct quotes) of all material you may end up using.
While a certain level of efficiency is desirable, it is also important to remain flexible enough to identify relevant and interesting ideas that you had not anticipated.
Writing as thinking
You can use the writing process to help you think through, clarify and develop your early ideas about how you might respond to the title that has been set:
‘you may not know what you think until you have written it down’ (Creme & Lea, 1997 p115).
As with teaching, it is often not until you try to communicate an argument and its evidence that you find where the gaps are in your knowledge or argument. So don’t be afraid of writing down your ideas before they are fully formed, or in the ‘right’ order.
Writing is an active and constructive process; it is not merely a neutral recording of your thoughts. It is therefore useful to go into the writing process expecting to make revisions. The first words you write do not have to be part of the final version. Editing your writing as you develop your ideas is a positive not a negative process: the more you cross out, re-write, and re-order, the better your essay should become.
Establishing a relevant structure to support your argument
All essays need structure. The structure may be strong and clear, or it may be unobtrusive and minimal but, in a good essay, it will be there.
Underpinning the structure will be the ‘argument’ your essay is making. Again this may be strong and obvious, or it may be almost invisible, but it needs to be there. In different subject areas, and with different styles of writing, the term ‘argument’ may seem more or less relevant. However, even in those essays that appear to be highly creative, unscientific, or personal, an argument of some kind is being made.
It is the argument, and how you decide to present and back up your argument, that will influence your decision on how to structure your essay.
The essay structure is not an end in itself, but a means to an end: the end is the quality of the argument.
By creating a relevant structure, you make it much easier for yourself to present an effective argument. There are several generic structures that can help you start to think about your essay structure e.g.:
- by context;
These can be useful starting points, but you will probably decide to work with a more complicated structure e.g.:
- overall chronological structure; broken down by comparisons according to the elements of the title;
- overall thematic structure; broken down by sub-themes;
- overall comparative structure; broken down by context.
In addition to these macro-structures you will probably need to establish a micro-structure relating to the particular elements you need to focus on e.g.: evidence / policy / theory / practice / case studies / examples / debates.
You may feel that, for your particular essay, structures like these feel too rigid. You may wish to create a more flexible or fluid structure. Perhaps a more suitable word than ‘structure’ in those cases may be ‘pattern’, or ‘impression’, or ‘atmosphere’; although these merge into the field of creative writing rather than essay writing.
An analogy could be that of symphony writing. The composers Haydn and Mozart, working in the 18th century, tended to write symphonies to fit reliably and closely within what was called ‘symphonic form’. This set out a pattern for the numbers of movements within the symphony, and for the general structure of writing within each movement. The continued popularity of their work today shows that they clearly managed to achieve plenty of interest and variety within that basic structure.
Later composers moved away from strict symphonic form. Some retained a loose link to it while others abandoned it completely, in favour of more fluid patterns. It would be rare, however, to find a symphony that was without structure or pattern of any kind; it would probably not be satisfactory either to play or to listen to. Similarly, a structure of some kind is probably essential for every essay, however revolutionary.
Your decisions on structure will be based on a combination of:
- the requirements of your department;
- the potential of the essay title; and
- your own preferences and skills.
An iterative, not necessarily a linear process
The process of essay planning and writing does not need to be a linear process, where each stage is done only once. It is often an iterative process i.e.: a process where earlier stages are repeated when they can be revised in the light of subsequent work. A possible iterative process is:
- analyse the title
- brainstorm relevant ideas
- read around the title, making relevant notes
- prepare a first draft
- analyse the title again
- critically review your first draft in the light of this further analysis
- read further to fill in gaps
- prepare final draft
- critically edit the final draft
- submit the finished essay.
‘Helping your readers’
This section heading is in quotes as it is also the heading of chapter 8, pages 80-92, in Barass (1982). Barass (1982 p80) makes the simple but valid statement that:
‘By making things easy for your readers, you help yourself to convey information and ideas.’
The tutors reading and marking your essays deserve your consideration. They will be reading and marking many, many student essays. If you make your argument hard to follow, so that they need to re-read a paragraph (or more) to try to make sense of what you have written, you will cause irritation, and make their job slower. Realistically, it is possible that they may even decide not to make that effort. It is your task to present your argument in a way that your audience can follow; it is not your audience’s job to launch an investigation to detect the points you are trying to make.
Your tutors will not necessarily be looking for the perfect, revolutionary, unique, special essay; they would be very happy to read a reasonably well-planned, well-argued and well-written essay. They will not want to pull your essay to pieces. They would much rather enjoy reading it, and be satisfied by the thread of your argument. In the words of a tutor:
‘I’m looking for focus, for a voice that I feel confident with and not bored by – someone who knows the area and is going to take me round the issues in an objective, informed and interesting way.’ Stott (2001 p 37)
A powerful introduction is invaluable. It can engage your readers, and can give them confidence that you have thought carefully about the title, and about how you are going to address it. A useful generic structure is to:
- begin with a general point about the central issue;
- show your understanding of the task that has been set;
- show how you plan to address the title in your essay structure;
- make a link to the first point.
It may be possible to use only one paragraph for your introduction, but it may fall more easily into two or more. You will need to adapt and extend this basic structure to fit with your own discipline and the precise task set. Here is an example of an introduction for an essay entitled:
Examine and compare the nature and development of the tragic figures of Macbeth and Dr Faustus in their respective plays.
- Begin with a general point
Dr Faustus and Macbeth are both plays that show their respective playwrights at the pinnacle of their careers.
- Show your understanding of the task set
When comparing the nature of the two plays’ respective heroes, both parallels and contrasts can be found.
- Show how you plan to address the title
In the first section of this essay, the role of the tragic hero will be considered … The second section of the essay will examine the nature … Finally, a comparison will be made of the development of the two …
- Make a link to the first point
In examining the characters’ tragic qualities, a useful starting point is Aristotle’s definition of tragedy…
Although the introduction appears at the beginning of your essay, you may prefer to write it towards the end of the drafting process:
‘It is only when you have completed a piece of writing that you can introduce it to the reader.’ (Creme & Lea, 1997 p115)
Questions to ask of your introduction and conclusion may be useful.
The heart of the essay
The middle part of the essay must fulfil the promises made in your introduction, and must support your final conclusions. Failure to meet either or both of these requirements will irritate your reader, and will demonstrate a lack of self-critique and of editing.
The central part of your essay is where the structure needs to do its work, however explicit or implicit your chosen structure may be. The structure you choose needs to be one that will be most helpful to you in addressing the essay title.
The content of this central part will probably contain: ideas; explanations; evidence; relevant referencing; and relevant examples. It will be characterised by:
- appropriate academic style;
- interesting and engaging writing;
- clarity of thought and expression;
- sensible ordering of material, to support the development of ideas and the development of argument.
Questions to ask of your essay content may be useful.
A powerful conclusion is a valuable tool. The aim is to leave your reader feeling that you have done a good job. A generic structure that you may find useful is:
- brief recap of what you have covered in relation to the essay title;
- reference to the larger issue;
- evaluation of the main arguments;
- highlighting the most important aspects.
The example below relates to the essay title used in the example introduction above.
- Brief recap
The characters of Macbeth and Faustus are very similar in many respects; for example they both willingly follow a path that leads to their damnation. …
- Reference to the larger issue
The differences lie in the development of the characters in what are essentially two different types of plays.
- Evaluation of the main arguments
As has been shown, the character of Macbeth has a nadir from which he ascends at the conclusion of the play. This is in keeping with Aristotle’s definition of tragedy. For Faustus however, there is no such ascension. This fits with the style of the morality play: the erring Faustus must be seen to be humbled at his end for the morality to be effective…
- Highlighting the most important aspects
It is this strong element of morality in Dr Faustus that ultimately divides the two leading characters.
Questions to ask of your introduction and conclusion may be useful.
Being a critical writer
After attending closely to the title, and establishing a useful structure, a third main element in the essay-writing process is the confident use of ‘critical writing’. The study guide What is critical writing? provides more extensive guidance in this area, but it is useful to present one section from that guide below:
The most characteristic features of critical writing are:
- a clear and confident refusal to accept the conclusions of other writers without evaluating the arguments and evidence that they provide;
- a balanced presentation of reasons why the conclusions of other writers may be accepted or may need to be treated with caution;
- a clear presentation of your own evidence and argument, leading to your conclusion; and
- a recognition of the limitations in your own evidence, argument, and conclusion.
With critical writing, you are doing work with the evidence you are using, by adding a level of examination and evaluation. Stott (2001 p37) proposes that, ‘Knowledge-telling is the regurgitation of knowledge in an essay. But knowledge-transfer is what’s crucial: the ability to manipulate that basic, raw material in order to make a convincing argument’. Questions to ask about your level of critical writing may be useful.
One way to practise critical writing is to make sure that you don’t leave any description to speak for itself, if it is part of your evidence and argument. If a quote or piece of data is worth including, then it’s also worth explaining why you’ve included it: ‘Do not leave your reader to work out the implications of any statement.’ (Barass 1982 p80).
Another useful tool to support critical writing is the paragraph! Aim to present one idea per paragraph. Within the paragraph you could:
- introduce the idea/piece of evidence/quote/stage of argument;
- present the idea/piece of evidence/quote/stage of argument;
- comment on it – this is where you demonstrate your critical thinking and writing.
A different pattern would be to use a paragraph to present and describe an idea/piece of evidence/quote/stage of argument, then to use the subsequent paragraph to explain its relevance.
Finally, you need to take a break from your essay so that you can return to it with fresh eyes for the final editing.
‘Editing and proof reading are not the icing on the cake, as some people think. They are absolutely crucial because it is only at this stage that the student can see that the argument hangs together, has a sequence and is well-expressed. Editing is both difficult and important.’ (Stott, 2001 p39)
Yes, editing is important, but no, it does not need to be difficult. You’ve done most of the hard work already in the reading, evaluating, and writing. Also, criticising your writing tends to be easier than creating it in the first place. The study guide: The art of editing and the sheet: Questions to ask when editing may be useful.
A tutor can learn a worrying amount about the quality of your essay simply from how it looks on the page. The lengths of paragraphs; the lengths of sentences; the neatness of the reference list; the balance of length between different sections; all offer insight into the kind of essay they are about to read.
In general, think ‘short and straightforward’. Shorter words are often preferable to longer words, unless there is some specific vocabulary that you need to include to demonstrate your skill. Short to middle length sentences are almost always preferable to longer ones. And over-long paragraphs tend to demonstrate that you are not clear about the specific points you are making. Of course, these are general points, and there may be some occasions, or some subject areas, where long paragraphs are appropriate.
Accurate grammar and spelling are important. Consistently poor grammar or spelling can give the impression of lack of care, and lack of clarity of thought. Careless use of commas can actually change the meaning of a sentence. And inaccurate spelling and poor grammar can make for very irritating reading for the person marking it. The previous sentence began with ‘And’. This practice is now widely accepted where it makes good sense. It is however possible that some tutors may still prefer not to see it.
Summary of key points
The title is the most important guidance you have. The task ahead is nothing more and nothing less than is stated in the title. When in doubt about any aspect of your reading for the essay, or about your writing, the first step is to go back and consult the essay title. This can be surprisingly helpful. It informs directly: the choice of reading; the structure you choose for the essay; which material to include and exclude; what to do with the material you use; and how to introduce and conclude.
A relevant and useful structure to support the presentation of your response to the title is vital.
Expect to undertake an iterative process of planning, reading, drafting, reviewing, planning, reading, re-drafting, and editing.
Editing is a crucial part of the process, not an optional extra.
Barass R (1982) Students must write: a guide to better writing in coursework and examinations. London: Methuen.
Creme P & Lea MR (1997) Writing at university: a guide for students. Buckingham: Open University Press.
Stott R (2001) The essay writing process. Chapter 3 pp36-58. In Making your case: a practical guide to essay writing. Eds. Stott R, Snaith A, & Rylance R. Harlow: Pearson Education Limited.
Questions to ask of your reference list may be useful when reviewing your own reference list.
Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast). It is widely used as a form of information entry from printed paper data records, whether passport documents, invoices, bank statements, computerised receipts, business cards, mail, printouts of static-data, or any suitable documentation. It is a common method of digitising printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.
Early versions needed to be trained with images of each character, and worked on one font at a time. Advanced systems capable of producing a high degree of recognition accuracy for most fonts are now common, with support for a variety of digital image file formats as input. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components.
Early optical character recognition may be traced to technologies involving telegraphy and creating reading devices for the blind. In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. Concurrently, Edmund Fournier d'Albe developed the Optophone, a handheld scanner that, when moved across a printed page, produced tones that corresponded to specific letters or characters.
In the late 1920s and into the 1930s Emanuel Goldberg developed what he called a "Statistical Machine" for searching microfilm archives using an optical code recognition system. In 1931 he was granted USA Patent number 1,838,389 for the invention. The patent was acquired by IBM.
With the advent of smartphones and smartglasses, OCR can be used in internet-connected mobile device applications that extract text captured using the device's camera. Devices that do not have OCR functionality built into the operating system will typically use an OCR API to extract the text from the image file captured and provided by the device. The OCR API returns the extracted text, along with information about the location of the detected text in the original image, back to the device app for further processing (such as text-to-speech) or display.
Blind and visually impaired users
In 1974, Ray Kurzweil started the company Kurzweil Computer Products, Inc. and continued development of omni-font OCR, which could recognise text printed in virtually any font (Kurzweil is often credited with inventing omni-font OCR, but it was in use by companies, including CompuScan, in the late 1960s and 1970s). Kurzweil decided that the best application of this technology would be to create a reading machine for the blind, which would allow blind people to have a computer read text to them out loud. This device required the invention of two enabling technologies: the CCD flatbed scanner and the text-to-speech synthesiser. On January 13, 1976, the successful finished product was unveiled during a widely reported news conference headed by Kurzweil and the leaders of the National Federation of the Blind. In 1978, Kurzweil Computer Products began selling a commercial version of the optical character recognition computer program. LexisNexis was one of the first customers, and bought the program to upload legal paper and news documents onto its nascent online databases. Two years later, Kurzweil sold his company to Xerox, which had an interest in further commercialising paper-to-computer text conversion. Xerox eventually spun it off as Scansoft, which merged with Nuance Communications. The research group headed by A. G. Ramakrishnan at the Medical intelligence and language engineering lab, Indian Institute of Science, has developed the PrintToBraille tool, an open source GUI frontend that can be used by any OCR to convert scanned images of printed books to Braille books.
In the 2000s, OCR was made available online as a service (WebOCR), in a cloud computing environment, and in mobile applications like real-time translation of foreign-language signs on a smartphone.
Various commercial and open source OCR systems are available for most common writing systems, including Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali (Bangla), Devanagari, Tamil, Chinese, Japanese, and Korean characters.
OCR engines have been developed into many kinds of domain-specific OCR applications, such as receipt OCR, invoice OCR, check OCR, legal billing document OCR.
They can be used for:
- Data entry for business documents, e.g. check, passport, invoice, bank statement and receipt
- Automatic number plate recognition
- Automatic insurance documents key information extraction
- Extracting business card information into a contact list
- Making textual versions of printed documents more quickly, e.g. book scanning for Project Gutenberg
- Making electronic images of printed documents searchable, e.g. Google Books
- Converting handwriting in real time to control a computer (pen computing)
- Defeating CAPTCHA anti-bot systems, though these are specifically designed to prevent OCR. The purpose can also be to test the robustness of CAPTCHA anti-bot systems.
- Assistive technology for blind and visually impaired users
OCR is generally an "offline" process, which analyses a static document. Handwriting movement analysis can be used as input to handwriting recognition. Instead of merely using the shapes of glyphs and words, this technique is able to capture motions, such as the order in which segments are drawn, the direction, and the pattern of putting the pen down and lifting it. This additional information can make the end-to-end process more accurate. This technology is also known as "on-line character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition".
OCR software often "pre-processes" images to improve the chances of successful recognition. Techniques include:
- De-skew – If the document was not aligned properly when scanned, it may need to be tilted a few degrees clockwise or counterclockwise in order to make lines of text perfectly horizontal or vertical.
- Despeckle – remove positive and negative spots, smoothing edges
- Binarisation – Convert an image from colour or greyscale to black-and-white (called a "binary image" because there are two colours). Binarisation is a simple way of separating the text (or any other desired image component) from the background. It is necessary because most commercial recognition algorithms work only on binary images, which are simpler to process. The effectiveness of the binarisation step also influences to a significant extent the quality of the character recognition stage, so the binarisation method must be chosen carefully for a given input: how well a method performs depends on the type of input image (scanned document, scene text image, historical degraded document, etc.).
- Line removal – Cleans up non-glyph boxes and lines
- Layout analysis or "zoning" – Identifies columns, paragraphs, captions, etc. as distinct blocks. Especially important in multi-column layouts and tables.
- Line and word detection – Establishes baseline for word and character shapes, separates words if necessary.
- Script recognition – In multilingual documents, the script may change at the level of the words and hence, identification of the script is necessary, before the right OCR can be invoked to handle the specific script.
- Character isolation or "segmentation" – For per-character OCR, multiple characters that are connected due to image artifacts must be separated; single characters that are broken into multiple pieces due to artifacts must be connected.
- Normalise aspect ratio and scale
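As a rough illustration of the binarisation step in the list above, the sketch below picks a single global threshold with Otsu's method and maps every pixel to ink (1) or background (0). Otsu's method is an assumption chosen for illustration, not a technique named in this text; the function names and the tiny sample image are hypothetical, and the sketch assumes dark text on a light background.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that best separates dark from light pixels.

    The level chosen maximises the between-class variance of the two groups.
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_level, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue  # no pixels at or below this level yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark group
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_level = variance, level
    return best_level


def binarise(image):
    """Map a greyscale image (rows of 0-255 values) to 1 = ink, 0 = background."""
    threshold = otsu_threshold([value for row in image for value in row])
    # Assumption: text is darker than the page, so low values count as ink.
    return [[1 if value <= threshold else 0 for value in row] for row in image]


# Hypothetical 2 x 3 greyscale image: three dark pixels, three light pixels.
image = [
    [20, 30, 220],
    [25, 210, 230],
]
binary = binarise(image)
```

A single global threshold suits evenly lit scans; unevenly lit or degraded pages are the kind of input for which a different binarisation method may be the better choice.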
Segmentation of fixed-pitch fonts is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. For proportional fonts, more sophisticated techniques are needed because whitespace between letters can sometimes be greater than that between words, and vertical lines can intersect more than one character.
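The fixed-pitch case described above can be sketched as follows: count the ink in each pixel column, then try every grid offset and keep the one whose vertical lines cross the least ink. This is a minimal sketch that assumes the character pitch is already known and the image is already binarised; the function names and sample image are hypothetical.

```python
def column_ink(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def grid_lines(image, pitch):
    """Return the x-positions of vertical grid lines for a fixed-pitch font.

    Every possible starting offset is tried, and the one whose grid lines
    cross the fewest ink pixels is kept.
    """
    ink = column_ink(image)

    def ink_crossed(offset):
        return sum(ink[x] for x in range(offset, len(ink), pitch))

    best_offset = min(range(pitch), key=ink_crossed)
    return list(range(best_offset, len(ink), pitch))


# Two 3-pixel-wide glyphs, each followed by one blank column (pitch = 4).
image = [
    [1, 1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 1, 1, 0],
]
lines = grid_lines(image, pitch=4)
```

Here the grid lines land on the two blank columns. A proportional font has no single pitch to search over, which is why this simple approach does not carry over to it.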
There are two basic types of core OCR algorithm, which may produce a ranked list of candidate characters.
Matrix matching involves comparing an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching", "pattern recognition", or "image correlation". This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. This technique works best with typewritten text and does not work well when new fonts are encountered. This is the technique the early physical photocell-based OCR implemented, rather directly.
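A minimal sketch of matrix matching, assuming the glyph has already been isolated and scaled to the same 3 x 3 grid as the stored templates. The templates, the scoring rule (number of agreeing pixels) and the function names are all hypothetical simplifications; the result is a ranked list of candidate characters.

```python
# Hypothetical stored glyphs, drawn on a 3 x 3 grid ("1" = ink).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        glyph_pixel == template_pixel
        for glyph_row, template_row in zip(glyph, template)
        for glyph_pixel, template_pixel in zip(glyph_row, template_row)
    )


def rank_candidates(glyph, templates=TEMPLATES):
    """Return (character, score) pairs, best match first."""
    scores = {char: match_score(glyph, tpl) for char, tpl in templates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A "T" with one stray ink pixel in the bottom-right corner.
noisy_t = ["111", "010", "011"]
ranked = rank_candidates(noisy_t)
```

Because the comparison is pixel by pixel, a glyph in an unfamiliar font or at a different scale would agree with its template on far fewer pixels, which is the weakness noted above.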
Feature extraction decomposes glyphs into "features" like lines, closed loops, line direction, and line intersections. Extracting features reduces the dimensionality of the representation and makes the recognition process computationally efficient. These features are compared with an abstract vector-like representation of a character, which might reduce to one or more glyph prototypes. General techniques of feature detection in computer vision are applicable to this type of OCR, which is commonly seen in "intelligent" handwriting recognition and indeed most modern OCR software. Nearest neighbour classifiers such as the k-nearest neighbors algorithm are used to compare image features with stored glyph features and choose the nearest match.
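To make the idea concrete, the sketch below reduces a 4 x 4 glyph (16 pixels) to four numbers and classifies it with a k-nearest-neighbour vote. The feature used here, ink count per quadrant, is a deliberately crude stand-in for the lines, loops and intersections mentioned above, and the training vectors and function names are invented for illustration.

```python
import math
from collections import Counter


def quadrant_features(glyph):
    """Reduce a glyph (rows of "0"/"1") to ink counts for its four quadrants.

    Order: top-left, top-right, bottom-left, bottom-right.
    """
    mid_y, mid_x = len(glyph) // 2, len(glyph[0]) // 2
    counts = [0, 0, 0, 0]
    for y, row in enumerate(glyph):
        for x, pixel in enumerate(row):
            if pixel == "1":
                counts[2 * int(y >= mid_y) + int(x >= mid_x)] += 1
    return tuple(counts)


def knn_classify(features, training, k=3):
    """Label a feature vector by majority vote among its k nearest samples."""
    nearest = sorted(training, key=lambda sample: math.dist(features, sample[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented stored feature vectors for two characters.
TRAINING = [
    ((2, 0, 3, 2), "L"), ((2, 0, 2, 2), "L"), ((1, 0, 3, 2), "L"),
    ((3, 3, 2, 2), "T"), ((3, 2, 2, 2), "T"), ((2, 3, 2, 1), "T"),
]

# An "L" whose corner has been thickened by noise.
noisy_l = ["1000", "1000", "1100", "1110"]
features = quadrant_features(noisy_l)
label = knn_classify(features, TRAINING)
```

Comparing four-element vectors rather than whole bitmaps is the dimensionality reduction referred to above, and it tolerates small distortions that would hurt a pixel-by-pixel comparison.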
Software such as Cuneiform and Tesseract use a two-pass approach to character recognition. The second pass is known as "adaptive recognition" and uses the letter shapes recognised with high confidence on the first pass to recognise better the remaining letters on the second pass. This is advantageous for unusual fonts or low-quality scans where the font is distorted (e.g. blurred or faded).
The OCR result can be stored in the standardised ALTO format, a dedicated XML schema maintained by the United States Library of Congress.
For a list of optical character recognition software see Comparison of optical character recognition software.
OCR accuracy can be increased if the output is constrained by a lexicon – a list of words that are allowed to occur in a document. This might be, for example, all the words in the English language, or a more technical lexicon for a specific field. This technique can be problematic if the document contains words not in the lexicon, like proper nouns. Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy.
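One way to picture lexicon constraint is to combine the recogniser's ranked candidate characters with a word list and keep only readings that form an allowed word. The sketch below assumes each character position comes with candidate characters and confidence values; the confidences, the lexicon, and the function names are invented for illustration.

```python
def top_reading(candidates):
    """Unconstrained reading: the most confident character at each position."""
    return "".join(max(options, key=options.get) for options in candidates)


def best_lexicon_word(candidates, lexicon):
    """Return the lexicon word with the highest combined confidence, or None."""
    best_word, best_score = None, 0.0
    for word in lexicon:
        if len(word) != len(candidates):
            continue
        score = 1.0
        for ch, options in zip(word, candidates):
            # A character the recogniser never proposed scores zero.
            score *= options.get(ch, 0.0)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical per-character candidates for a five-letter word.
candidates = [
    {"c": 0.9, "e": 0.1},
    {"1": 0.6, "l": 0.4},
    {"e": 0.9},
    {"a": 0.8, "o": 0.2},
    {"r": 0.9},
]
lexicon = {"clear", "clean", "close"}

print(top_reading(candidates))                  # prints c1ear
print(best_lexicon_word(candidates, lexicon))   # prints clear
```

When no lexicon word is compatible with the candidates, the function returns None, which corresponds to the out-of-lexicon problem described above for words such as proper nouns.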
The output stream may be a plain text stream or file of characters, but more sophisticated OCR systems can preserve the original layout of the page and produce, for example, an annotated PDF that includes both the original image of the page and a searchable textual representation.
"Near-neighbor analysis" can make use of co-occurrence frequencies to correct errors, by noting that certain words are often seen together. For example, "Washington, D.C." is generally far more common in English than "Washington DOC".
Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or a noun, for example, allowing greater accuracy.
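The co-occurrence correction described above can be sketched by counting adjacent word pairs in a reference corpus and preferring the candidate that more often follows the preceding word. The three-sentence corpus here is made up purely for illustration, and the function names are hypothetical.

```python
from collections import Counter


def count_pairs(sentences):
    """Count how often each pair of adjacent words occurs."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts


def pick_by_cooccurrence(previous_word, candidates, pair_counts):
    """Choose the candidate seen most often after previous_word."""
    return max(candidates, key=lambda w: pair_counts[(previous_word, w)])


# Tiny invented corpus; a real system would use a large text collection.
corpus = [
    "Washington D.C. is the capital",
    "she flew to Washington D.C. yesterday",
    "Washington DOC filed a report",
]
pair_counts = count_pairs(corpus)

print(pick_by_cooccurrence("Washington", ["DOC", "D.C."], pair_counts))  # prints D.C.
```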
The Levenshtein Distance algorithm has also been used in OCR post-processing to further optimize results from an OCR API.
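As an illustration of how edit distance can be used in post-processing, the sketch below computes the Levenshtein distance and snaps an OCR output word to the closest lexicon entry when it is close enough. The threshold of 2 and the small lexicon are assumptions of this example, not values taken from any OCR product.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def correct(word, lexicon, max_distance=2):
    """Replace word with its nearest lexicon entry if within max_distance."""
    best = min(lexicon, key=lambda entry: levenshtein(word, entry))
    return best if levenshtein(word, best) <= max_distance else word


lexicon = ["recognition", "recognise", "character"]
print(correct("rec0gnition", lexicon))  # prints recognition
print(correct("Tesseract", lexicon))    # prints Tesseract
```

The second call leaves the word unchanged because no lexicon entry is within the threshold, which avoids forcing an out-of-lexicon word into a wrong correction.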
In recent years, the major OCR technology providers began to tweak OCR systems to better deal with specific types of input. Beyond an application-specific lexicon, better performance can be had by taking into account business rules, standard expression, or rich information contained in color images. This strategy is called "Application-Oriented OCR" or "Customised OCR", and has been applied to OCR of license plates, invoices, screenshots, ID cards, driver licenses, and automobile manufacturing.
There are several techniques for solving the problem of character recognition by means other than improved OCR algorithms.
Forcing better input
Special fonts like OCR-A, OCR-B, or MICR fonts, with precisely specified sizing, spacing, and distinctive character shapes, allow a higher accuracy rate during transcription. These were often used in early matrix-matching systems.
"Comb fields" are pre-printed boxes that encourage humans to write more legibly – one glyph per box. These are often printed in a "dropout color" which can be easily removed by the OCR system.
Palm OS used a special set of glyphs, known as "Graffiti", which are similar to printed English characters but simplified or modified for easier recognition on the platform's computationally limited hardware. Users would need to learn how to write these special glyphs.
Zone-based OCR restricts the image to a specific part of a document. This is often referred to as "Template OCR".
Crowdsourcing humans to perform the character recognition can quickly process images like computer-driven OCR, but with higher accuracy for recognising images than is obtained with computers. Practical systems include the Amazon Mechanical Turk and reCAPTCHA. The National Library of Finland has developed an online interface for users to correct OCRed texts in the standardised ALTO format. Crowdsourcing has also been used not to perform character recognition directly but to invite software developers to develop image processing algorithms, for example, through the use of rank-order tournaments.
Commissioned by the U.S. Department of Energy (DOE), the Information Science Research Institute (ISRI) had the mission of fostering the improvement of automated technologies for understanding machine-printed documents. From 1992 to 1996 it conducted the Annual Test of OCR Accuracy, the most authoritative such evaluation.
Recognition of Latin-script, typewritten text is still not 100% accurate even where clear imaging is available. One study based on recognition of 19th- and early 20th-century newspaper pages concluded that character-by-character OCR accuracy for commercial OCR software varied from 81% to 99%; total accuracy can be achieved by human review or Data Dictionary Authentication. Other areas—including recognition of hand printing, cursive handwriting, and printed text in other scripts (especially those East Asian language characters which have many strokes for a single character)—are still the subject of active research. The MNIST database is commonly used for testing systems' ability to recognise handwritten digits.
Accuracy rates can be measured in several ways, and how they are measured can greatly affect the reported accuracy rate. For example, if word context (basically a lexicon of words) is not used to correct software finding non-existent words, a character error rate of 1% (99% accuracy) may result in an error rate of 5% (95% accuracy) or worse if the measurement is based on whether each whole word was recognised with no incorrect letters.
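The gap between character-level and word-level figures can be checked with a short calculation. This sketch assumes that character errors are independent and that every word has the same length; both are simplifications, and the function name is hypothetical.

```python
def word_error_rate(char_error_rate, word_length):
    """Probability that a word contains at least one wrong character,
    assuming independent errors at each character position."""
    return 1 - (1 - char_error_rate) ** word_length


# A 1% character error rate applied to five-letter words.
print(round(word_error_rate(0.01, 5), 3))   # prints 0.049
# Longer words fare worse under the same character error rate.
print(round(word_error_rate(0.01, 10), 3))  # prints 0.096
```

Under these assumptions a 1% character error rate gives roughly a 5% word error rate for five-letter words, in line with the figures quoted above.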
Web-based OCR systems for recognising hand-printed text on the fly have become well known as commercial products in recent years (see Tablet PC history). Accuracy rates of 80% to 90% on neat, clean hand-printed characters can be achieved by pen computing software, but that accuracy rate still translates to dozens of errors per page, making the technology useful only in very limited applications.
Recognition of cursive text is an active area of research, with recognition rates even lower than those of hand-printed text. Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognising entire words from a dictionary is easier than trying to parse individual characters from script. Reading the Amount line of a cheque (which is always a written-out number) is an example where using a smaller dictionary can increase recognition rates greatly. The shapes of individual cursive characters themselves simply do not contain enough information to accurately (greater than 98%) recognise all handwritten cursive script.
Main article: Optical Character Recognition (Unicode block)
Characters to support OCR were added to the Unicode Standard in June 1993, with the release of version 1.1.
Some of these characters are mapped from fonts specific to MICR, OCR-A or OCR-B.