|
Post by Devonator on Dec 5, 2012 4:18:39 GMT -5
Project Members
Logan Buchy, Martin Pajchel
Brief
Abstract about the feature, including something about significance. What should it do? How should we do it? What is the deliverable?
Significance
Broader context about why this feature is important to add to the robot.
Milestones
- Optical Character Recognition
- Page Segmentation?
Technicals
Image Processing
Filter descriptions and implementations may go here.
|
|
|
Post by martin on Dec 5, 2012 20:39:09 GMT -5
Information:
- Project completed as part of EECE 466 (DSP)
- Program segments and reads scanned documents in Times New Roman
- Problems if it is to be integrated with the robot:
|
|
|
Post by martin on Dec 5, 2012 20:40:47 GMT -5
(cont.)
- skew
- perspective
- lens (fisheye) distortion from the camera
- Logan and I are probably going to work on other things that are more core functions the robot needs.
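In case someone picks this up later, here is a rough OpenCV sketch of how those three problems might be attacked. The calibration matrix, distortion coefficients and page corners below are placeholders; the real values would have to come from camera calibration and a corner detector:

import cv2
import numpy as np

img = cv2.imread("frame.png")          # camera frame containing the document
h, w = img.shape[:2]

# 1) Lens distortion: undo it with calibration data (placeholders here;
#    real values would come from cv2.calibrateCamera on a checkerboard).
camera_matrix = np.array([[800.0, 0.0, w / 2.0],
                          [0.0, 800.0, h / 2.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)
undistorted = cv2.undistort(img, camera_matrix, dist_coeffs)

# 2) Skew/perspective: map the four detected page corners to a flat rectangle.
page_corners = np.float32([[50, 40], [600, 55], [620, 460], [35, 440]])
flat_rect = np.float32([[0, 0], [640, 0], [640, 480], [0, 480]])
M = cv2.getPerspectiveTransform(page_corners, flat_rect)
flattened = cv2.warpPerspective(undistorted, M, (640, 480))

cv2.imwrite("flattened.png", flattened)

The flattened image could then be handed to the existing segmentation code as if it were a scan.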
|
|
|
Post by Devonator on Dec 6, 2012 20:51:53 GMT -5
Here are the logs as taken from Trac. As per the post above, I believe this project is being discontinued for now?
Log
10/09 - MARTIN:
- Current training script I have (enclosed).
- Most literature says to use a multi-layered NN with the number of input nodes equal to the number of pixels.
- Got it to converge pretty close with 30000 epochs; am trying to put a max of 100000 epochs and train till convergence.
- Found more links on how to train these; let me know if you want them. Some StackOverflow links: http://stackoverflow.com/questions/9092821/python-neurolab-feed-forward-neural-network, http://stackoverflow.com/questions/12404128/neural-network-to-train-a-image-so-as-to-get-its-unicode-as-output-python
- The lib book referenced an article that talks about splitting up the NN for classes of letters. We could use vertical projections to class into: [a, o, m, n], [b, t, l, ...], [y, j, q, ...]. A lower number of letters makes it easier to train the NN (less computing time, probably more accurate as well).
- Tesseract OCR: http://code.google.com/p/tesseract-ocr/ -- open-source OCR for documents. Got it installed; written in C++, runs in a Visual Studio 2008 environment. Hope we can use it for the car recognition part if needed. Haven't played around with it yet.
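For the record, here is a minimal sketch of the kind of training setup described above, using neurolab from the first StackOverflow link. The glyph size, hidden-layer size and the random placeholder data are illustrative assumptions, not the enclosed script:

import numpy as np
import neurolab as nl

# One input node per pixel, as the literature suggests. 8x8 glyphs and the
# random "images" below are placeholders standing in for real letter samples.
n_pixels = 8 * 8
letters = ['a', 'o', 'm', 'n']          # one vertical-projection class of letters
n_classes = len(letters)
samples_per_letter = 50

inputs = np.random.rand(samples_per_letter * n_classes, n_pixels)
targets = np.zeros((samples_per_letter * n_classes, n_classes))
for i in range(len(targets)):
    targets[i, i % n_classes] = 1.0     # one-hot target: which letter it is

# Feed-forward net: n_pixels inputs -> 30 hidden neurons -> one output per letter.
net = nl.net.newff([[0.0, 1.0]] * n_pixels, [30, n_classes])

# Train until the error goal is hit or the epoch cap is reached
# (the log mentions a cap of 100000; kept small here so the sketch runs quickly).
net.train(inputs, targets, epochs=2000, show=200, goal=0.01)

# Classify a glyph: the output neuron with the largest activation wins.
out = net.sim(inputs[:1])
print(letters[int(np.argmax(out[0]))])

The same structure would be repeated per letter class, which is what makes the split by vertical projections attractive.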
10/15 - LOGAN:
- Referenced paper: "Optical Character Recognition by a Neural Network" by Michael Sabourin. It suggests that this may be more difficult than just attaching the input neurons directly to the pixels.
- Martin and I have confirmed that for a larger set of letters (more than 6-ish) the NN will not accurately recognise any letters.
- The paper suggests finding the contours of letters and deriving a 'tangent field' from these contours.
Quote from paper...
In this classification system, the primary feature is the shape of the object, as represented by the tangent field of its contour. The tangent field is derived by smoothing the chain code description and then uniformly sampling the contour to 64 points. Smoothing reduces noise influences, and uniform sampling makes this feature scale invariant. The angle between adjacent samples of the smoothed contour are encoded as a vector, which forms the input to the neural network.
- Have implemented some code to find the 'tangent field' for letters. The current state of the code requires large, high-resolution letters to generate the tangent fields.
- The paper is quite extensive on how they approached OCR. They have many layers of networks to recognise characters that are similar to each other. They also use the genus of the object (the number of contours within the letter is how I interpreted this).
- For the neural network, we need to create a larger data set to train with. We need to generate many samples of 'A' (rotated? added noise?). The paper used 200 per letter, but they can also identify many typefaces.
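To make the quoted feature concrete, here is a rough sketch of one way to compute such a tangent field. This is not the code mentioned above; the moving-average smoothing and the "angle of the segment between adjacent samples" reading are my assumptions about the paper's description:

import numpy as np

def tangent_field(contour, n_samples=64, smooth=5):
    """Smooth a closed contour, resample it uniformly to n_samples points,
    and return the direction angle of each segment between adjacent samples
    as a scale-invariant feature vector (64 values for the NN input)."""
    pts = np.asarray(contour, dtype=float)            # (N, 2) closed contour

    # Circular moving average as a stand-in for smoothing the chain code.
    kernel = np.ones(smooth) / smooth
    sm = np.column_stack([
        np.convolve(np.r_[pts[-smooth:, i], pts[:, i], pts[:smooth, i]],
                    kernel, mode="same")[smooth:-smooth]
        for i in (0, 1)
    ])

    # Resample uniformly by arc length so the feature is scale invariant.
    closed = np.vstack([sm, sm[:1]])
    seg_len = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    arc = np.r_[0.0, np.cumsum(seg_len)]
    even = np.linspace(0.0, arc[-1], n_samples, endpoint=False)
    resampled = np.column_stack([
        np.interp(even, arc, closed[:, i]) for i in (0, 1)
    ])

    # Angle of each segment joining adjacent samples -> the tangent field.
    d = np.diff(np.vstack([resampled, resampled[:1]]), axis=0)
    return np.arctan2(d[:, 1], d[:, 0])

Each glyph would then contribute a fixed-length 64-angle vector as the NN input instead of raw pixels, which is what the quote means by the uniform sampling making the feature scale invariant.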
|
|