Dynamsoft Blog

The leading provider of version control solutions and TWAIN SDKs

How to Upload Scanned Images to Amazon S3 Using Dynamic Web TWAIN SDK

In this tutorial, I’d like to share how to upload images captured with Dynamic Web TWAIN (DWT) to Amazon S3. The techniques used include the Dynamic Web TWAIN SDK, the Amazon S3 REST API, JavaScript and PHP. You should be familiar with the DWT SDK and with basic Amazon S3 concepts (e.g., Bucket, Access Key, Secret Key, Policy and so on).

As stated in the official Amazon S3 documentation, S3 accepts an HTTP POST request directly from the client as long as the request carries a valid policy and signature. To upload the image data captured by the DWT SDK, you will need to:

1) Create a web page that contains the DWT SDK plus the policy and signature information (Base64-encoded and signed)
2) Once the scanning process has finished, read the acquired image into a buffer, convert the buffer into a Blob, and send the Blob data to S3.

Dynamic Web TWAIN Amazon S3 Demo

Upload images to Amazon S3

How to Send Data to Amazon S3

On the server side, you can write a web page that includes the following code:

<?php
$bucket = "XXX"; //Specify the bucket which is the place or the folder name used for storing data on Amazon S3
$accesskey = "YourOwnAccessKey"; 
$secret = "YourOwnSecretKey"; //Specify the Access Key and Secret Key you obtained from your Amazon S3 account. You should keep this confidential

$policy = json_encode(array(
    'expiration' => '2018-08-06T12:00:00.000Z',
    'conditions' => array(
        array(
            'bucket' => $bucket
        ),
        array(
            'starts-with',
            '$key',
            'Events/'
        ),
        array(
            'acl' => 'public-read'
        )
    )
)); //Create a policy that specifies what you permit and what you don’t permit for the data uploaded from a client web page
$base64Policy = base64_encode($policy);
$signature = base64_encode(hash_hmac("sha1", $base64Policy, $secret, true)); //Sign the Base64-encoded policy with HMAC-SHA1 using the secret key. The policy is encoded, not encrypted. Store both values in the hidden input elements.
?>
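
For readers whose server side runs on Node.js rather than PHP, the same steps can be sketched as follows. This is a minimal sketch, not part of the original sample: the function name `signS3PostPolicy` and its parameters are hypothetical, and it assumes the same HMAC-SHA1 signing scheme used in the PHP code above.

```javascript
const crypto = require('crypto');

// Build and sign an S3 POST policy, mirroring the PHP steps above.
// bucket, keyPrefix, secretKey and expiresAt are placeholders supplied by the caller.
function signS3PostPolicy(bucket, keyPrefix, secretKey, expiresAt) {
    const policy = {
        expiration: expiresAt.toISOString(), // e.g. 2018-08-06T12:00:00.000Z
        conditions: [
            { bucket: bucket },
            ['starts-with', '$key', keyPrefix],
            { acl: 'public-read' }
        ]
    };
    // Base64-encode the JSON policy (encoding only, no encryption)
    const base64Policy = Buffer.from(JSON.stringify(policy), 'utf8').toString('base64');
    // Sign the encoded policy with HMAC-SHA1 and Base64-encode the raw digest
    const signature = crypto.createHmac('sha1', secretKey).update(base64Policy).digest('base64');
    return { base64Policy, signature };
}
```

Computing the expiration from the current time, instead of hard-coding a date, keeps the policy from silently expiring. The secret key stays on the server; only the encoded policy and the signature are written into the page.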

The code for the client side is trickier and might be error-prone if you are not familiar with S3 rules. Check the code below:
HTML

<input type="hidden" name="AWSAccessKeyId" value="<?php echo $accesskey; ?>">
<input type="hidden" name="policy" value="<?php echo $base64Policy; ?>">
<input type="hidden" name="signature" value="<?php echo $signature; ?>">
 
<li style="text-align: center">
<input id="btnUploadS3" type="button" value="Upload Image To S3" onclick ="btnUploadS3_onclick()"/></li>

JAVASCRIPT

function btnUploadS3_onclick() {
    if (!checkIfImagesInBuffer()) {
        return;
    }
 
    var i, strHTTPServer, strActionPage, strImageType;
    _txtFileName.className = "";
    if (!strre.test(_txtFileName.value)) {
        _txtFileName.className += " invalid";
        _txtFileName.focus();
        appendMessage("Please input <b>file name</b>.<br />Currently only English names are allowed.<br />");
        return;
    }
 
    for (i = 0; i < 4; i++) {
        if (document.getElementsByName("ImageType").item(i).checked == true) {
            strImageType = i + 1;
            break;
        }
    }
    var uploadfilename = _txtFileName.value + '.' + document.getElementsByName("ImageType").item(i).value;
 
    DWObject.SelectedImagesCount = 1;
    var size = DWObject.GetSelectedImagesSize(strImageType);
    var ary = new Array(size);
    for (i = 0; i < size; i++)
        ary[i] = 0;
    DWObject.SaveSelectedImagesToBytes(size, ary); // save the image data into buffer
 
    // make xmlHttpRequest and send the binary data to S3
    strActionPage = "http://s3.amazonaws.com/XXX/"; // XXX is the name of the bucket that will store the uploaded data
    var byteArray = new Uint8Array(ary);      // Copy the bytes into a typed array...
    var binaryBuffer = byteArray.buffer;      // ...take its underlying ArrayBuffer...
    var dataBlob = new Blob([binaryBuffer]);  // ...and wrap it in a Blob for upload
 
    var fd = new FormData();
    uploadfilename = 'Events/' + uploadfilename;
    // The fields you need to specify are: key, AWSAccessKeyId, acl, policy, signature and file. Note: file must be the last field. S3 ignores any form field placed after file, so a required field that follows it is lost and the upload fails.
    fd.append('key', uploadfilename);
    fd.append('AWSAccessKeyId', document.getElementsByName("AWSAccessKeyId").item(0).value);
    fd.append('acl', 'public-read');
    fd.append('policy', document.getElementsByName("policy").item(0).value);
    fd.append('signature', document.getElementsByName("signature").item(0).value);          
    fd.append("file",  dataBlob, uploadfilename);
 
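    var xhr = new XMLHttpRequest();
    xhr.open('POST', strActionPage);
    xhr.addEventListener('load', function(e) {
        // 'load' also fires for error responses, so check the status code
        if (xhr.status >= 200 && xhr.status < 300) {
            console.log('uploaded!', e);  // Successful upload!
        } else {
            console.log('upload failed: ' + xhr.status, xhr.responseText);
        }
    });
    xhr.send(fd);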
}
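
Because the position of the file field is the easiest rule to get wrong, it can help to enforce it in one place. The helper below is a minimal sketch under that assumption; `orderS3PostFields` is a hypothetical name and not part of the DWT SDK or of S3.

```javascript
// Return [name, value] pairs in the order they should be appended to the form:
// every policy-related field first, the file content last.
function orderS3PostFields(fields, fileValue) {
    const pairs = [];
    for (const name of Object.keys(fields)) {
        if (name !== 'file') { // the file is always appended at the end
            pairs.push([name, fields[name]]);
        }
    }
    pairs.push(['file', fileValue]);
    return pairs;
}

// Usage in the browser (dataBlob is the Blob built from the scanned image):
// var fd = new FormData();
// orderS3PostFields({ key: 'Events/scan.jpg', acl: 'public-read' }, dataBlob)
//     .forEach(function (pair) { fd.append(pair[0], pair[1]); });
```

Keeping the ordering in a plain function means it can be checked without a browser or a real upload.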

Source Code

You can get the complete source code of the sample here.

To use Dynamic Web TWAIN to embed web scanning and upload to Amazon S3 cloud in your own application, you can download the 30-day free trial.

Free Online OCR Service with Dynamsoft TWAIN SDKs

Dynamsoft provides a free online OCR tool for end-users.

How to Implement the Online OCR Tool

While doing OCR online, there are four basic steps.

Step 1: load images

Using Dynamic Web TWAIN SDK, images can be loaded into an image view component with a few lines of JavaScript code:

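function btnLoad_onclick() {
    // Remember how many images were in the buffer before loading
    _iHowManyImages = DWObject.HowManyImagesInBuffer;
    // Let the user pick the file in a dialog
    DWObject.IfShowFileDialog = true;
    DWObject.LoadImageEx("", 5, function() {
        g_DWT_PrintMsg("Loaded an image successfully.");
    }, function() {
        // Report the failure instead of ignoring it
        g_DWT_PrintMsg(DWObject.ErrorString);
    });

    _iErrorCode = DWObject.ErrorCode;
    _strErrorString = DWObject.ErrorString;
}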

Step 2: select a language

To create the language and result options, just use the HTML <select> element:

<select id=" ">
<option value=" "> </option>
<option value=" "> </option>
</select>
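
If the list of languages comes from data rather than hand-written markup, the same element can be generated. The following is a minimal sketch; the id `ddlLanguages` and the language values are hypothetical, since the post does not show the real ones.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Build the markup for a <select> from a list of { value, label } items.
function buildSelectHtml(id, items) {
    const options = items.map(function (item) {
        return '<option value="' + escapeHtml(item.value) + '">' + escapeHtml(item.label) + '</option>';
    });
    return '<select id="' + escapeHtml(id) + '">\n' + options.join('\n') + '\n</select>';
}

// Hypothetical language list, for illustration only.
const languageSelect = buildSelectHtml('ddlLanguages', [
    { value: 'eng', label: 'English' },
    { value: 'fra', label: 'French' }
]);
```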

Read more

Want to See How UGL Boosted Productivity after Migration to SourceAnywhere Version Control from Visual SourceSafe?

“Our developers have found the migration to using SourceAnywhere to be quite smooth, with minimal retraining required. In addition, we are benefiting far more in productivity because of SourceAnywhere’s significantly improved performance, particularly when using remote access.”

Bhargav Srinivas, 
Senior Project / Systems Engineer, UGL


SUMMARY

UGL® Limited, a global diversified services company with end-to-end outsourced engineering, construction, asset management and maintenance services resulting in annual revenue of $2.2+ billion, has software developer teams that rely on a collaborative environment. For a long time, UGL used Microsoft® Visual SourceSafe software as its main repository of development activities. But the software became old and was no longer supported, and it stopped meeting UGL’s performance requirements. The company had to find a new source control solution to maintain a highly collaborative environment for developers regardless of their locations. UGL found Dynamsoft™ and its SourceAnywhere™ source control solution. SourceAnywhere was selected due to its similarities in operation to the previously used product. It also had migration tools available to enable an easy transition from SourceSafe. SourceAnywhere has proven itself by providing a significant boost in overall performance, particularly in security and remote access.

THE OBSTACLE

Software development teams from UGL’s engineering, fabrication, modeling, technical services and other related business sectors rely on a collaborative environment. To this end, the company has long used version control software to help foster developer collaboration. Over the years UGL relied on Microsoft® Visual SourceSafe software as the main repository of development activities. But it relied heavily on a flat-file system, and support for it has ceased. The official replacement was found not to suit UGL’s needs. The software had also become quite slow when accessing repositories from satellite sites. The team was able to improve remote site access. However, this was a limited improvement and required third-party caching products to integrate with SourceSafe.

UGL had to find a new source control solution to maintain a highly collaborative environment for its developers regardless of their locations. As a result, strong security and performance for remote access would be key features to have in the new solution.

Read the full case study>>

Device Security: Factors for Changing the Sandbox Security Mechanism

It seems more and more sensors are being used in wireless communication modules with each new version of a device. We have accelerometers, gyroscopes, compasses, and more just in smartphones. These components are often integrated with other components – such as, Bluetooth, Wi-Fi, NFC, etc. – for enhanced data sharing. Such functionalities are helping to usher in the new age of the Internet of things (IoT). But, as devices and their applications share more data with each other, security risks increase. Manufacturers are addressing this in part by having implemented another layer of security: the Sandbox mechanism.

Apple Pay

Apple recently rolled out Apple Pay with the hope of popularizing e-wallet payments based on NFC. It is known that Apple uses the Sandbox technique to secure its applications. On iOS, this is done at the OS level. Sandbox is a security method used to isolate running programs from each other at an application or OS level. With it, developers can restrict applications or devices from accessing certain OS resources. It can add a layer of protection for user data when hackers exploit vulnerabilities in applications or systems.

More companies like Apple, Google and Microsoft are moving to secure systems and web browsers with Sandbox. But, it may impact end user and developer behavior. For example, some users would rather root their devices to obtain more system rights, which circumvents Sandbox. If users or developers are willing to squander an added Sandbox layer of security, another question begs asking. Is it necessary to enforce a Sandbox technique on an OS or with web browsers?

Sandbox for Mobile

The capabilities of applications running in Sandbox mode can be extremely constricted. But explicit policies can be set up to grant permissions. For example, an application can be allowed to access key system features and specific user data. Apple and Android devices provide a Sandbox mechanism. It’s popularly known that Android is more flexible in allowing applications to obtain greater permissions. Almost anyone who has downloaded an app is familiar with how Sandbox is implemented. When an app is installed, sometimes a user will be prompted to review permissions and decide to enable or disable them. In doing so, he or she sets the Sandbox policies for that app.
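
The idea of explicit grants can be pictured with a toy model. The sketch below is illustrative only: `createSandbox` and the permission names are hypothetical and do not correspond to any real iOS or Android API.

```javascript
// A toy model of a sandbox policy: an app may only use the capabilities
// the user granted at install time. Names here are illustrative only.
function createSandbox(grantedPermissions) {
    const granted = new Set(grantedPermissions);
    return {
        // True when the capability was explicitly granted.
        isAllowed: function (permission) {
            return granted.has(permission);
        },
        // Run the action only if the capability was granted; otherwise refuse.
        request: function (permission, action) {
            if (!granted.has(permission)) {
                throw new Error('Permission denied: ' + permission);
            }
            return action();
        }
    };
}

// The user accepted "camera" but declined "contacts" at install time.
const appSandbox = createSandbox(['camera']);
```

Everything that was not granted is denied by default, which is the tradeoff the rest of this section discusses: more safety, less freedom.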

Android’s enhanced security flexibility includes more possibilities without rooting. For example, users aren’t always restricted to downloading apps only from Google Play. They can download and install them from unknown or other sources. While it can present a greater security risk, this flexibility does help satisfy additional user requirements. Obviously there is a tradeoff between having more restrictions and being more open. It’s commonly known that iOS is the most locked-down mobile operating system. Apple has restricted access to the OS far more than Google has for Android. In a way, iOS protects users more automatically, whereas on Android most users have to judge and apply restrictions themselves. In some cases, users might not even understand what permissions they may or may not want to grant. So, it becomes easily arguable that an Android device is at much greater risk of getting infected by a virus or worm than an iOS device. But it’s also easily arguable that Android is more flexible in application use.

The industry needs better balance. A new question has to be addressed. How can manufacturers better balance an even more flexible user experience while enabling an even safer environment? The Sandbox technique must adapt to address this necessary shift.

Sandbox for Desktops

The Windows and Mac OSes provide official application stores for users, and the stores employ the Sandbox mechanism. For example, by default in Mac OS X unverified third-party applications are now disallowed. But Sandbox on a desktop OS is different from a mobile OS. Mobile OSes were born with Sandbox. The desktop was born more open. For users who touched iOS before Mac OS, Sandbox comes across as more acceptable and even convenient. But for long-time users and developers who lived on Mac OS first, it is a little bit odd and harder to accept.

On Windows 8, the oddness is a bit different. For example, we can install Skype as a Windows App from the app store with Sandbox enforcement. But we can also install it from the Skype website without Sandbox enforcement. The app version is different from the desktop version in other ways too. But, to users, Skype on a single machine should probably just be Skype. It’s difficult to grasp the same application appearing twice, with the two copies operating slightly differently from one another, including in their permissions. For many years on desktops, people were able to install applications with few or no restrictions, let alone prompts to allow this or not allow that. Thus, it’s likely that on desktops the Sandbox mechanism will be an alternative, but not a replacement, security measure.

Sandbox for Web Browsers

HTML5 technology is fast becoming mainstream in web browsers. As a result, leading browser developers are abandoning certain components. We’ve seen IE abandon ActiveX and Chrome abandon NPAPI, etc. More and more, web browsers are using Sandbox to replace old security techniques and disallow plugins from directly accessing system resources.

Sandbox can provide end users with a more secure browsing environment. But Sandbox might also be inconvenient for users and developers. Let’s explain. When web browsers automatically update in the background, users will likely not be aware of which web browser features have changed. That is, until they suddenly cannot access or use certain features or functionalities that previously worked. For example, an online banking system might have been using ActiveX for password input and verification. If ActiveX is abandoned without the user knowing, how will users access their accounts? Thus, it’s likely that banks are in no rush to update their systems.

As more web browsers move to enforce the Sandbox mechanism, developers will have to figure out new solutions to replace plugins. For example, WebSocket might be one way to move from the old plugin model to the new one.

What’s the Path?

The Sandbox mechanism has proven itself excellent for securing devices, though not without hurdles. Developers by now fully understand it comes with tradeoffs. The main one is more application and device restrictions which obviously results in less freedom. Thus, it is hard to cover all user requirements. There’s no doubt that providing multiple options for the Sandbox security mechanism is ideal. It’s undeniable that more and more devices will talk to one another. The same is true of their applications. Thus, developers should move to face the challenges in Sandbox security now rather than later.

Building a Document Management Solution: Do it from Scratch or Use 3rd Party SDKs?

Buy or Build

Buy it or build it? Today, this is an age-old question in the IT world. The mere outcome of one versus the other can yield grand differences in the scope of desired benefits, cost, and time to accomplish. This major decision is one many organizations also face when deciding upon a document management solution (DMS). So, should you really build your own DMS application entirely from scratch? Buying one is likely simple enough but, it also can limit you on desired features. So, if one opts to build a DMS, is help available to undertake such a task?

Other Initial Questions
There is a lot to figure out when you’re just starting out with deciding to buy or build a document management solution. You must understand, at a strategic level, what core competencies you expect from the application. This is in addition to comprehending tactical underpinnings that will make up the underlying processes for achieving common tasks. Of course, you also need to consider time-to-market or time-to-first-use. For time-to-market, can you build it all from scratch and meet your development deadlines? If it’s time-to-first-use, you have to consider how urgently you need to start using it. You’ll want to weigh these things against the necessary time and resources to properly execute the software.

Once you elect to start building it, even more questions start to pop up. Have you allocated enough R&D resources to do the related work? For example, will you need to adopt and implement technology standards and if so, do you have a full understanding of those standards? If not, how much time will need to be spent educating oneself on necessary standards to correctly implement them? With document management solutions, many standards come into play, from image acquisition interfaces to file extension types. You have to also thoroughly understand and make sure you know your true cost of ownership over the lifecycle of the software. For example, can you accurately account for staffing six months or a year or more from now? It’s critical because the staff needs to provide continuous technical support for each component of the software. As we all know, the cost to build or own software extends beyond initial development or purchasing. Technical support, upgrades, scalability – all of these elements can add surprise long-term costs.

Going It Alone But, With Help
Those that opt to build their own solution often do so for obvious reasons, one of which is flexibility to customize as needed. It’s important to note that one can opt to purchase certain pre-built components. This can save extensively on development costs and time while still allowing the full flexibility of custom-built solutions. If you’re building a house from scratch, you’ll probably purchase pre-built windows, doors, and fixtures. This saves extra money and time you would have otherwise spent on extra sanding, cutting, measuring, etc. In the same way, software developers commonly opt to use an available off-the-shelf database rather than fuss around with trying to design one. One might also use software development kits (SDKs) for other components, such as the interface for document scanning and processing. Building your own database and image capture module can be daunting. For image capture there are industry standards to comprehend that are hundreds of pages long. You really shouldn’t even begin to code without full comprehension of these standards. Then there is the code itself – it can be hundreds to more than a thousand lines of code of additional work. Building an image capture component yourself can add months of extra development time and costs.

Let’s get more specific. The TWAIN application programming interface (API) is one of the most popular communications protocols to regulate interfacing between software and digital imaging devices. So, there’s work to be done to know how to properly support this one standard. You’ll start by learning the 600+ pages that make up the TWAIN specification. This is so you can become familiar with how to use TWAIN to talk to imaging devices, such as scanners. Understanding TWAIN to develop related scanner programming is essential. So, it’s no wonder many programmers opt to use SDKs for specific components. This is a very common practice in the document management software market. The use of a document imaging SDK, such as Dynamic Web TWAIN, can allow the programmer to implement just a couple of lines of code to start calling the TWAIN API for scanning in a web application. It turns months of work into just hours or a few days. It also helps keep coding clean – the use of an SDK can reduce your code development to just a few lines. If you’ve opted to build your own software, SDKs make very convenient options when time or costs are a concern.
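
To give a sense of scale, a basic scan with such an SDK can be sketched in a handful of lines. This is a minimal sketch rather than production code: `startScan` is a hypothetical wrapper, `dwObject` is assumed to be an already initialized Dynamic Web TWAIN object, and the method names are as I recall them from the SDK's scanning samples, so check them against the documentation for your SDK version.

```javascript
// Minimal scan sequence; dwObject is assumed to be an initialized
// Dynamic Web TWAIN object supplied by the caller.
function startScan(dwObject) {
    // Let the user pick a TWAIN source; stop if the dialog is cancelled.
    if (!dwObject.SelectSource()) {
        return false;
    }
    dwObject.OpenSource();
    // Close the source's user interface once all images are acquired
    dwObject.IfDisableSourceAfterAcquire = true;
    dwObject.AcquireImage();
    return true;
}
```

Everything the TWAIN specification requires underneath these calls is handled by the SDK.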

Maintain Focus
Another key reason many organizations opt to use SDKs is that it allows them to maintain focus. Often, a document management solution is the request of a client to a software development shop. That shop might, for example, provide expertise in software tailored to an industry, such as healthcare or finance.

So, while their healthcare client has requested a document management solution from them, building every component can pull them away from their focus on healthcare software and services. For example, coding together a document scanning module is likely not a core competency. So, building this might defocus a shop and add a lot of undesirable cost to the project. In this way, an SDK vendor lets a shop stay focused and keep client costs low.

Pointers on SDK Selection
OK. You’ve decided to build the document management software and use an SDK to help implement the image capture component. So, how can you be sure you pick the correct SDK? There are a few things to consider. One obvious thing to do is to check the background and stability of the SDK vendor. Do they have plenty of customer referrals, and how long has their solution been available (is it mature)? Make a checklist of the features you want and check it twice. Does the SDK support all or most of the features you need? What about integration? How easily can the SDK be integrated into your new or existing document management software workflow? Do the SDK’s image acquisition capabilities include library support for the essentials, like TWAIN, scanner, webcam, .NET, etc.?

Finally, you need to check out support options for the SDK as well as migration paths to newer versions. Remember that standards come and go. For example, NPAPI plugins for browsers are being displaced in favor of HTML5 versions. What side of the fence will your software sit on and, if you jump the fence, will your SDK provider allow you to migrate seamlessly?

Scratching Your Head
Building completely from scratch can ultimately leave one scratching their head – why did I opt to go this route? It’s not uncommon for critical and time-consuming steps to be forgotten or even abandoned because of their difficulty. Remember, developing from scratch without an SDK will mean hundreds to thousands more lines of code and many more months of work. You then have to thoroughly test your solution prior to deployment, then test it again after deployment and with each update. Don’t forget about training staff, from the development stages to the usage stages. You also have to make sure your resources are up to the task. Will adding months more work to finish the solution defocus you too much, and are certain staff going to be okay with this? Will management be okay with taking people off core tasks? What about when you have to support the software? Are you up to the task of providing continuous technical support? Have you considered whether you might be better off with key components instead being fully supported by a reliable SDK vendor?

In the end, most project managers and developers realize the best path is to stay focused on what you do best in-house and get help with the rest. It truly saves a lot in time, costs and headaches.

Copyright © 2014 Dynamsoft. All Rights Reserved.