Contact

We can help you turn your idea into reality, take over your existing project, or extend your current development team.


GET IN TOUCH
Address

Velisava Vulovica 18
11000 Beograd

SUPPORT

Purchased one of our plugins?

Reach out to support

Premier development hub for planning, building, support and enhancement of top-notch web applications.


S3 - OCR - Serverless OCR Text Recognition

S3-OCR is an application that runs OCR on a given PDF file as quickly as possible.

Services
  • Project analysis and planning
  • Project management
  • Development
  • QA and maintenance
Technology
  • Amazon Web Services (AWS)
  • Serverless framework
  • Node.js
  • webpack
  • tesseract-ocr
  • Poppler
Client
  • VirtualPostMail

Overview

PDF-OCR is an application that performs OCR on a given PDF file as quickly as possible. VirtualPostMail had been using an older application for this purpose, but it could not process large PDFs, was not compatible with all the available servers, and frequently needed 2 to 3 days to process a request, or failed to finish it at all. Our partners at VirtualPostMail relied on us to build a new application with improved performance, so that their customers have a fast and pleasant experience with their services.

Kickoff

Our job was to build a new application from scratch. The new app is designed to run on the Amazon Web Services (AWS) platform using the Serverless framework. Large PDFs are now split across different servers and their pages are processed in parallel, which significantly lowers processing time while the costs remain the same. We leveraged AWS cloud services to build an application with increased flexibility, scalability and reliability. PDF-OCR is integrated with other systems so that the entire process runs automatically.
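The split-and-process-in-parallel idea can be illustrated with a minimal Node.js sketch. It is not the production code: `chunkPages`, `ocrDocument` and the stand-in worker `fakeOcr` are hypothetical names, and the real per-chunk worker (which would presumably call Poppler and tesseract-ocr on a separate server) is replaced by a stub so the example stays self-contained.

```javascript
// Split a page count into inclusive [first, last] ranges of at most chunkSize pages.
function chunkPages(totalPages, chunkSize) {
  const ranges = [];
  for (let first = 1; first <= totalPages; first += chunkSize) {
    ranges.push([first, Math.min(first + chunkSize - 1, totalPages)]);
  }
  return ranges;
}

// Run OCR over every chunk concurrently and join the results in page order.
// ocrRange is a hypothetical async function: (first, last) => recognized text.
async function ocrDocument(totalPages, chunkSize, ocrRange) {
  const ranges = chunkPages(totalPages, chunkSize);
  // Promise.all keeps results in the same order as the input ranges,
  // regardless of which chunk finishes first.
  const texts = await Promise.all(
    ranges.map(([first, last]) => ocrRange(first, last))
  );
  return texts.join('\n');
}

// Stand-in worker used only for illustration; it performs no real OCR.
const fakeOcr = async (first, last) => `text of pages ${first}-${last}`;

ocrDocument(10, 4, fakeOcr).then((text) => console.log(text));
```

Because each chunk is independent, total time is bounded by the slowest chunk rather than by the sum of all pages, which is where the lower processing time for large PDFs comes from.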

Timeline

After agreeing with the client on the purpose and requirements of the application, we started by drawing an architecture diagram of the app, which helps system designers and developers visualize the high-level structure and ensure that the system meets its goals. We decided to use AWS Step Functions, which lets you coordinate multiple AWS services into serverless workflows so you can build and update apps quickly. With Step Functions orchestrating AWS Lambda functions, the workflows of this feature-rich application run fast. When set up properly, Step Functions automatically triggers and tracks each step and retries when there are errors, so the application executes in order and as expected.
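To make the Step Functions description concrete, here is a hedged sketch of what such a workflow definition could look like in Amazon States Language, built as a plain JavaScript object. The state names (`SplitPdf`, `OcrPages`, `OcrPage`, `MergeText`), the `$.pages` input path, the retry numbers and the Lambda ARNs are all assumptions made for illustration; the actual state machine used in the project is not shown in this case study.

```javascript
// Retry policy shared by every task: retry failed tasks up to 3 times,
// waiting 2s, then 4s, then 8s (values are illustrative assumptions).
const retryPolicy = [
  {
    ErrorEquals: ['States.TaskFailed'],
    IntervalSeconds: 2,
    MaxAttempts: 3,
    BackoffRate: 2.0,
  },
];

// Hypothetical placeholder ARN builder; REGION and ACCOUNT_ID are not real values.
const lambdaArn = (name) => `arn:aws:lambda:REGION:ACCOUNT_ID:function:${name}`;

const definition = {
  Comment: 'Sketch: split a PDF, OCR pages in parallel, merge the text',
  StartAt: 'SplitPdf',
  States: {
    SplitPdf: {
      Type: 'Task',
      Resource: lambdaArn('split-pdf'),
      Retry: retryPolicy,
      Next: 'OcrPages',
    },
    OcrPages: {
      // A Map state runs its inner workflow once per item in $.pages.
      Type: 'Map',
      ItemsPath: '$.pages',
      MaxConcurrency: 10, // at most 10 pages in flight at once
      Iterator: {
        StartAt: 'OcrPage',
        States: {
          OcrPage: {
            Type: 'Task',
            Resource: lambdaArn('ocr-page'),
            Retry: retryPolicy,
            End: true,
          },
        },
      },
      Next: 'MergeText',
    },
    MergeText: {
      Type: 'Task',
      Resource: lambdaArn('merge-text'),
      Retry: retryPolicy,
      End: true,
    },
  },
};

// The JSON form is what would be supplied as the state machine definition.
console.log(JSON.stringify(definition, null, 2));
```

In this sketch the ordering comes from the `Next` links and the error handling from the `Retry` blocks, so the Lambda functions themselves need no orchestration logic. Newer definitions may use `ItemProcessor` in place of `Iterator` inside the Map state.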

Key Results

  • OCR application that runs on AWS services using the Serverless framework
  • Fast application able to process a large amount of data in a short period of time
  • Highly usable custom software for VirtualPostMail