APIs for Image Capture Applications

Dynamsoft contributed an article to Dr. Dobb’s Journal that ran on June 17, 2013. Start reading it below. TWAIN, WIA, and DirectShow are three popular APIs for imaging apps. How do they differ, and which one should you pick?

Online image-scanning applications continue to grow in importance. More organizations need to digitize documents for recordkeeping, safekeeping, and archiving, among other reasons. As document digitization grows, so does the need for more options in image capture. A useful image-capture application should deliver simple editing and upload features. It should also work with as many complementary devices as possible: scanners, digital cameras, capture cards, webcams, and so on. Developers looking to build image-capture applications have several APIs to choose from, including TWAIN, Windows Image Acquisition (WIA), and DirectShow. All three of these APIs sit between applications and digital devices.

Overview of APIs

TWAIN, created by the nonprofit TWAIN Working Group, includes support for devices such as scanners and digital cameras and is supported in operating systems such as Microsoft Windows, Mac OS X, and Linux. TWAIN is designed primarily for C/C++ development.

Figure 1: How TWAIN works between image acquisition applications and devices.

WIA is a Microsoft driver model and API for Microsoft Windows. It has been around since the days of Windows Me and also works with newer Windows operating systems. In Windows Me, WIA enabled graphics software to communicate with imaging hardware such as scanners, digital cameras, and digital video equipment. Since that release in 2000, Microsoft has steadily added features, including OLE integration. With the release of Windows Vista, however, WIA has been more tightly targeted towards scanners. WIA is also designed primarily for C/C++ development.

Figure 2: How Microsoft WIA works.

DirectShow is a multimedia framework and API produced by Microsoft. Software developers can use it to perform various operations with media files or media streams. DirectShow is a replacement for Video for Windows (VFW), also known as Video Compression Manager (VCM). Most webcams, including FireWire cameras, support the interfaces of DirectShow. Developers should note that while USB Video Class (UVC) cameras have the largest market share, FireWire cameras still occupy an important place in certain segments. For example, in security or industrial applications, users tend to prefer FireWire cameras to USB cameras. DirectShow supports many file types, including Advanced Systems Format (ASF), Windows Media Audio (WMA), Windows Media Video (WMV), AIFF, AU, Audio-Video Interleaved (AVI), MIDI, SND, and WAV. DirectShow is designed primarily for C++ development.

Comparing APIs

A popular misconception exists about TWAIN: namely, that it is too old for modern scanning. However, TWAIN, which was first developed in 1992, is actually a sophisticated API. Not only is it portable across many operating systems, it also enables device vendors to create a customized user interface for each driver. In contrast, WIA uses a common user interface for all devices. TWAIN has three transfer modes (native, memory, file), while WIA has only two (memory, file). Additionally, WIA provides a TWAIN compatibility layer that allows TWAIN-aware applications to communicate with WIA devices.

TWAIN is typically an ideal choice for devices such as scanners, due to its flexibility and features. For webcams, however, WIA or DirectShow is more appealing because each supports a wider range of devices. The newer the device, the more likely the device vendor is to support WIA or DirectShow rather than TWAIN.

DirectShow is perceived as one of the most complex Microsoft libraries. Anyone who tries to explore the core interface of DirectShow will find it difficult to learn, because it requires mastery of intricacies such as COM interfaces. For this reason, many developers who choose to work with DirectShow turn to third-party SDKs to accelerate development and integration, which can save a lot of work. You can find several third-party development tools online with a simple search, including both open-source and commercial solutions.
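
The comparison in this section can be restated as a small lookup table. The sketch below is only an illustration in Python: the names `ApiProfile`, `PROFILES`, and `suggest_apis` are hypothetical, the values simply repeat what the article says, and nothing here talks to a real driver. Listing WIA as a second option for scanners is an assumption based on the note that WIA has targeted scanners since Windows Vista.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ApiProfile:
    """Facts about one API as stated in the article (hypothetical structure)."""
    name: str
    platforms: Tuple[str, ...]       # operating systems named in the article
    transfer_modes: Tuple[str, ...]  # empty when the article does not discuss them
    vendor_ui: Optional[bool]        # True: per-driver UI; False: common UI; None: not discussed


# These values restate the article's comparison; they are not queried from any driver.
PROFILES = {
    "TWAIN": ApiProfile("TWAIN", ("Windows", "Mac OS X", "Linux"),
                        ("native", "memory", "file"), True),
    "WIA": ApiProfile("WIA", ("Windows",), ("memory", "file"), False),
    "DirectShow": ApiProfile("DirectShow", ("Windows",), (), None),
}


def suggest_apis(device: str, platform: str = "Windows") -> List[str]:
    """Return candidate APIs for a device kind, following the article's rule of thumb."""
    if device == "scanner":
        preferred = ["TWAIN", "WIA"]       # TWAIN first; WIA as a fallback is an assumption
    elif device == "webcam":
        preferred = ["WIA", "DirectShow"]  # the article favors these for webcams
    else:
        raise ValueError(f"unknown device kind: {device}")
    # Keep only the APIs the article lists as available on the requested platform.
    return [name for name in preferred if platform in PROFILES[name].platforms]
```

For example, `suggest_apis("scanner", "Linux")` keeps only TWAIN, because TWAIN is the only one of the three that the article describes as available outside Windows.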

Getting Started in Development Without an SDK

SDKs can significantly cut development time, but some folks will still choose to start from scratch. The benefit of going this route is greater flexibility to include and exclude various capabilities. When developing from scratch, note the following:

  • TWAIN: The TWAIN architecture consists of four layers: application, protocol, acquisition, and device. Developers get their applications to communicate with scanners and/or other devices through the protocol layer. You can find more information about all the available capabilities at the TWAIN website.
  • DirectShow: If you are planning to build a DirectShow application from scratch, you’ll need at least basic knowledge of C++/COM programming.
  • WIA: WIA uses the Windows Driver Model (WDM) architecture. Application developers can use WIA to call a set of unique capabilities that enable an application to communicate with WIA-compliant devices already running on Windows.
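
To make the four TWAIN layers named above more concrete, here is a toy model in Python. It is not the TWAIN API: the class names, the string messages ("open", "acquire", "close"), and the placeholder bytes are all hypothetical, and real TWAIN applications are usually written in C/C++ against the interface documented on the TWAIN website. The sketch only shows the shape of the layering, where the application talks to the protocol layer and never to the device directly.

```python
class Device:
    """Device layer: stands in for the low-level hardware driver."""

    def read_pixels(self) -> bytes:
        return b"\x00\x01\x02\x03"  # placeholder bytes, not a real image


class Source:
    """Acquisition layer: controls one device and answers requests."""

    def __init__(self, device: Device) -> None:
        self.device = device
        self.is_open = False

    def handle(self, message: str):
        if message == "open":
            self.is_open = True
            return "ok"
        if message == "acquire":
            if not self.is_open:
                raise RuntimeError("source must be opened before acquiring")
            return self.device.read_pixels()
        if message == "close":
            self.is_open = False
            return "ok"
        raise ValueError(f"unknown message: {message}")


class SourceManager:
    """Protocol layer: routes application requests to a named source."""

    def __init__(self) -> None:
        self._sources = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def send(self, source_name: str, message: str):
        return self._sources[source_name].handle(message)


def scan_once(manager: SourceManager, source_name: str) -> bytes:
    """Application layer: talks only to the protocol layer."""
    manager.send(source_name, "open")
    try:
        return manager.send(source_name, "acquire")
    finally:
        # Always close the source, even if acquisition fails.
        manager.send(source_name, "close")
```

A caller would register a source once, for example `manager.register("demo-scanner", Source(Device()))`, and then call `scan_once(manager, "demo-scanner")`. Keeping the application behind the protocol layer is what lets the same application code work with different sources.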

In addition, take advantage of forums such as the TWAIN forum, Stack Overflow, and MSDN for tips on getting started and avoiding pitfalls. Whether using an SDK or developing from scratch, there is plenty of openly available knowledge to help you create your own image-capture applications. And as more organizations turn to digitization for document management, the need for these applications will continue to grow.


April 17, 2015 - If you are interested in the differences among TWAIN, WIA, ISIS, and SANE, see the article: Document Scanning: TWAIN, WIA, ISIS or SANE?
