Build a Web Page to Scan Documents to PDF in Just 5 Minutes

If you are developing a web application that needs to handle different digital file formats, chances are PDF will be a must-have.

PDF is short for Portable Document Format. Introduced by Adobe long ago, the format preserves the design of a document no matter what application it was made in. Converting pages of text and graphics to PDF results in a compressed and visually clear file that can be read on a Mac or PC, typically using Adobe Reader. PDF is arguably the most widely used digital document format, regardless of industry. You can explore more about it and other formats in the post: Ideal Document File Formats for Digital Document Management.

To acquire documents from a scanner, your web application needs to talk to the scanner through a scanning protocol. This leaves you with two options: you can either spend a lot of time and effort learning the TWAIN standard, or you can use an off-the-shelf third-party SDK. Considering the affordable price and ease of use of a third-party SDK compared to the pain of studying a specification that is hundreds of pages long, I would definitely opt to evaluate an SDK. An SDK also makes sense if you are like me and hope to stay focused on your project rather than spend days or weeks on a scanning page.

There are a number of scanning SDKs available. They are all great to use. But as far as I know, given the same functionality, Dynamic Web TWAIN is the most user-friendly and the easiest to deploy SDK on the market. Here’s a disclaimer: I’m not saying this just because I work for Dynamsoft. In fact, don’t just take my word for it. Go see for yourself and try some alternatives. You WILL come to understand this is true. 🙂

With Dynamic Web TWAIN, you can build a scanning web page in just 5 minutes to scan documents to a PDF file format. As simple as that.

In this tutorial, I will show you step by step how to build a simple HTML page to scan documents and save them as a PDF file.

There are four simple steps:

  1. Start a Web Application
  2. Add Dynamic Web TWAIN to the HTML Page
  3. Use Dynamic Web TWAIN to Scan or Load Images
  4. Save Images as a PDF file

Right, we’ve made a video before: Making Web-based Document Imaging Apps in 2 Minutes

Perhaps you’ve already seen the video. If so, you might be wondering why the video says 2 minutes but this tutorial says 5 minutes. There are two reasons. First, the video only covers the scanning part, but in this tutorial we will also handle loading local images and saving to PDF. Second, we wanted to keep the video as short as possible, so we left out many details.

Another thing I would like to point out is that the video was made in 2014. At that time, Dynamic Web TWAIN’s version was 9.2. The current version of Dynamic Web TWAIN is 10.2. The recommended coding style in 10.2 has changed a little, although not much, to be more compliant with the HTML standard.

Okay, enough background. Let’s start our journey.

Step 1: Start a Web Application

Download the Dynamic Web TWAIN 30-day free trial from here.

Install Dynamic Web TWAIN. After installation, you can find it by default at C:\Program Files (x86)\Dynamsoft\Dynamic Web TWAIN SDK 10.2 Trial.

[Screenshot: Dynamic Web TWAIN installation folder]

Notice there are three folders in it: Documents, Resources, and Samples. The Documents folder is where “help” documents and a “developer’s guide” can be found. The Samples folder has all the provided Dynamic Web TWAIN samples. And the Resources folder has all the SDK files necessary to build a scanning web page, which is exactly what we need for our project.

Create an empty HTML page named scan2pdf.html. Copy Dynamic Web TWAIN’s Resources folder to the same location as scan2pdf.html:

[Screenshot: Dynamic Web TWAIN with an empty page]

Step 2: Add Dynamic Web TWAIN to the HTML Page

Okay, how many minutes do we have left? Four minutes! Take it easy.

Include dynamsoft.webtwain.initiate.js and dynamsoft.webtwain.config.js in the HTML head. These two files handle Dynamic Web TWAIN’s initialization and provide all the scanning and image editing APIs. Both are standard Dynamic Web TWAIN files maintained and upgraded by Dynamsoft:

<html>  
    <head>    
        <title>Scan Documents to PDF</title>
        <script type="text/javascript" src="Resources/dynamsoft.webtwain.initiate.js"> </script>
        <script type="text/javascript" src="Resources/dynamsoft.webtwain.config.js"> </script>
    </head>

Add a div container for Dynamic Web TWAIN, and register the OnWebTwainReady event to get access to Dynamic Web TWAIN via DWObject:

    <!-- dwtcontrolContainer is the default div id for Dynamic Web TWAIN control.
     If you need to rename the id, you should also change the id in dynamsoft.webtwain.config.js accordingly. -->
    <div id="dwtcontrolContainer"></div>
    <script type="text/javascript">
        Dynamsoft.WebTwainEnv.RegisterEvent('OnWebTwainReady', Dynamsoft_OnReady); // Register OnWebTwainReady event. This event fires as soon as Dynamic Web TWAIN is initialized and ready to be used
        var DWObject;

        function Dynamsoft_OnReady() {
            DWObject = Dynamsoft.WebTwainEnv.GetWebTwain('dwtcontrolContainer'); // Get the Dynamic Web TWAIN object that is embedded in the div with id 'dwtcontrolContainer'
        }

Step 3: Use Dynamic Web TWAIN to Scan or Load Images

Add Scan and Load buttons to the page:

    <input type="button" value="Scan" onclick="AcquireImage();" />
    <input type="button" value="Load" onclick="LoadImage();" />

Then add the implementations of AcquireImage() and LoadImage(). Notice how LoadImage() handles success and failure with the callback functions OnSuccess() and OnFailure():

        function AcquireImage() {
            if (DWObject) {
                DWObject.SelectSource();
                DWObject.OpenSource();
                DWObject.IfDisableSourceAfterAcquire = true;	// Scanner source will be disabled/closed automatically after the scan.
                DWObject.AcquireImage();
            }
        }

        //Callback functions for async APIs
        function OnSuccess() {
            console.log('successful');
        }

        function OnFailure(errorCode, errorString) {
            alert(errorString);
        }

        function LoadImage() {
            if (DWObject) {
                DWObject.IfShowFileDialog = true; // Open the system's file dialog to load image
                DWObject.LoadImageEx("", EnumDWT_ImageType.IT_ALL, OnSuccess, OnFailure); // Load images in all supported formats (.bmp, .jpg, .tif, .png, .pdf). OnSuccess or OnFailure will be called after the operation
            }
        } 
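
The AcquireImage() above ignores what SelectSource() and OpenSource() return, so if you cancel the Select Source dialog the code still tries to scan. Here is a minimal sketch of a more defensive variant. It assumes that, in your version of the SDK, both calls return true on success and that the object exposes an ErrorString property, so please check the API reference before relying on it. The names acquireWithChecks and report are my own and not part of Dynamic Web TWAIN.

```javascript
// Hypothetical helper: scan only if a source was selected and opened.
// Assumes SelectSource()/OpenSource() return true on success and that
// ErrorString holds the last error message (verify for your SDK version).
function acquireWithChecks(twain, report) {
    if (!twain) {
        report('Dynamic Web TWAIN is not ready yet.');
        return false;
    }
    if (!twain.SelectSource()) {
        report('No scanner was selected.');
        return false;
    }
    if (!twain.OpenSource()) {
        report(twain.ErrorString || 'The scanner could not be opened.');
        return false;
    }
    twain.IfDisableSourceAfterAcquire = true; // Close the source after the scan
    twain.AcquireImage();
    return true;
}
```

On the page, the Scan button could call it as acquireWithChecks(DWObject, function (msg) { alert(msg); }). Passing the object and the reporting function as parameters also makes the helper easy to try out with a fake object when no scanner is attached.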

Step 4: Save Images as a PDF file

We still have 1 minute left, don’t we? Now we have two options to get documents loaded into Dynamic Web TWAIN:

  • Scan documents from a scanner (AcquireImage());
  • Or load hard disk documents (LoadImage()).

It’s time to add a save button to the web page:

<input type="button" value="Save" onclick="SaveWithFileDialog();" />

Add the logic of saving documents to PDF:

        function SaveWithFileDialog() {
            if (DWObject) {
                if (DWObject.HowManyImagesInBuffer > 0) {
                    DWObject.IfShowFileDialog = true;
                    DWObject.SaveAllAsPDF("DynamicWebTWAIN.pdf", OnSuccess, OnFailure);
                }
            }
        }
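
The function above always suggests the same file name, DynamicWebTWAIN.pdf. If you scan often, a timestamped name can save you from overwriting earlier files. The sketch below is plain JavaScript and the helper name buildPdfFileName is hypothetical, not something the SDK provides.

```javascript
// Hypothetical helper: build a name such as "scan-20150321-093005.pdf"
// from a prefix and a Date, using the browser's local time.
function buildPdfFileName(prefix, date) {
    function pad(n) {
        return (n < 10 ? '0' : '') + n;
    }
    return prefix + '-' +
        date.getFullYear() + pad(date.getMonth() + 1) + pad(date.getDate()) + '-' +
        pad(date.getHours()) + pad(date.getMinutes()) + pad(date.getSeconds()) +
        '.pdf';
}
```

Inside SaveWithFileDialog() you could then pass buildPdfFileName('scan', new Date()) to SaveAllAsPDF() in place of the fixed name.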

Now, save the file.

That’s it. Congratulations. You have just built a web page in around 5 minutes that can scan or load documents and save them as a PDF file.

You can open scan2pdf.html in a browser and test it out.

You can either load a local document or scan documents into your web page. Let’s try scanning. This is what the page looks like when the Scan button is clicked:

[Screenshot: Dynamic Web TWAIN scan page]

Please note that only TWAIN compatible scanners will be listed in the Select Source dialog. If you don’t have a real scanner at hand, you can install a virtual scanner for testing, which is what I did. If you do have a scanner, but it doesn’t show up in the list, please check this article for a solution.
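
When a scanner is missing from the dialog, it can help to see which sources the SDK itself reports. The sketch below assumes the object exposes a SourceCount property and a GetSourceNameItems(index) method, which is how I remember the API in this generation of the SDK; please confirm against the documentation for your version. The helper name listSourceNames is hypothetical.

```javascript
// Hypothetical helper: collect the names of the sources the SDK can see.
// Assumes a SourceCount property and a GetSourceNameItems(index) method
// (verify both for your SDK version).
function listSourceNames(twain) {
    var names = [];
    if (!twain) {
        return names;
    }
    for (var i = 0; i < twain.SourceCount; i++) {
        names.push(twain.GetSourceNameItems(i));
    }
    return names;
}
```

Calling console.log(listSourceNames(DWObject)) after OnWebTwainReady has fired would print the list. An empty array suggests that no TWAIN driver is installed for your device.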

After a sample page is scanned, it looks something like this:

[Screenshot: Dynamic Web TWAIN scan page after scanning]

And yes, you can save it as a PDF file by clicking the Save button.

Here’s the entire code example:

<html>
<head>
    <title>Scan Documents to PDF</title>
    <script type="text/javascript" src="Resources/dynamsoft.webtwain.initiate.js"></script>
    <script type="text/javascript" src="Resources/dynamsoft.webtwain.config.js"></script>
</head>
<body>
    <input type="button" value="Scan" onclick="AcquireImage();" />
    <input type="button" value="Load" onclick="LoadImage();" />
    <input type="button" value="Save" onclick="SaveWithFileDialog();" />
    <br />

    <!-- dwtcontrolContainer is the default div id for Dynamic Web TWAIN control.
     If you need to rename the id, you should also change the id in dynamsoft.webtwain.config.js accordingly. -->
    <div id="dwtcontrolContainer"></div>

    <script type="text/javascript">
        Dynamsoft.WebTwainEnv.RegisterEvent('OnWebTwainReady', Dynamsoft_OnReady); // Register OnWebTwainReady event. This event fires as soon as Dynamic Web TWAIN is initialized and ready to be used

        var DWObject;

        function Dynamsoft_OnReady() {
            DWObject = Dynamsoft.WebTwainEnv.GetWebTwain('dwtcontrolContainer'); // Get the Dynamic Web TWAIN object that is embedded in the div with id 'dwtcontrolContainer'
        }

        function AcquireImage() {
            if (DWObject) {
                DWObject.SelectSource();
                DWObject.OpenSource();
                DWObject.IfDisableSourceAfterAcquire = true;	// Scanner source will be disabled/closed automatically after the scan.
                DWObject.AcquireImage();
            }
        }

        //Callback functions for async APIs
        function OnSuccess() {
            console.log('successful');
        }

        function OnFailure(errorCode, errorString) {
            alert(errorString);
        }

        function LoadImage() {
            if (DWObject) {
                DWObject.IfShowFileDialog = true; // Open the system's file dialog to load image
                DWObject.LoadImageEx("", EnumDWT_ImageType.IT_ALL, OnSuccess, OnFailure); // Load images in all supported formats (.bmp, .jpg, .tif, .png, .pdf). OnSuccess or OnFailure will be called after the operation
            }
        }

        function SaveWithFileDialog() {
            if (DWObject) {
                if (DWObject.HowManyImagesInBuffer > 0) {
                    DWObject.IfShowFileDialog = true;
                    DWObject.SaveAllAsPDF("DynamicWebTWAIN.pdf", OnSuccess, OnFailure);
                }
            }
        }
    </script>
</body>
</html>

One Step Further

The example above is simple and functions well. But sometimes, you may like to take things a step further. For example, how about automatically saving documents as a PDF without having to manually click the save button?

With Dynamic Web TWAIN’s event mechanism, it’s actually fairly easy to do.

Dynamic Web TWAIN offers a number of events you can subscribe to. Each event fires when a specific point is reached. For example, there is an OnMouseClick event for mouse clicks, an OnPostTransfer event that fires after each image is transferred, and an OnPostAllTransfers event that fires once all images in a scan have been transferred.

So at the end of function Dynamsoft_OnReady(), simply add:

            if (DWObject) {
                DWObject.RegisterEvent('OnPostAllTransfers', SaveWithFileDialog);
            }

This will do the job.
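
Note that SaveWithFileDialog() still opens the save dialog after each scan. If you want no dialog at all, something like the sketch below might work. It assumes that setting IfShowFileDialog to false makes SaveAllAsPDF() write straight to the path it is given, which you should verify for your SDK version. The helper name registerAutoSave and the filePath parameter are hypothetical.

```javascript
// Hypothetical helper: save every finished scan batch without a dialog.
// Assumes IfShowFileDialog = false makes SaveAllAsPDF write directly to
// the given path (verify for your SDK version).
function registerAutoSave(twain, filePath, onSuccess, onFailure) {
    if (!twain) {
        return false;
    }
    twain.RegisterEvent('OnPostAllTransfers', function () {
        // Only save when at least one image is in the buffer
        if (twain.HowManyImagesInBuffer > 0) {
            twain.IfShowFileDialog = false;
            twain.SaveAllAsPDF(filePath, onSuccess, onFailure);
        }
    });
    return true;
}
```

You would call it once at the end of Dynamsoft_OnReady(), passing DWObject, a placeholder path of your choosing, and the OnSuccess and OnFailure callbacks from earlier.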

Conclusion

This is a simple example to show you how to use Dynamic Web TWAIN to scan documents and save them as a PDF file. If you are interested in learning more about how Dynamic Web TWAIN can help your project, I would recommend you read the developer’s guide.

Let me know your thoughts and questions in the comments.

P.S. If you need to save to other image formats, such as BMP, JPEG, or TIFF, you can find the exact code sample in the Dynamic Web TWAIN trial package. You will, won’t you?
