How to Build a macOS Camera Barcode Scanner in C++ Using AVFoundation

This is the final article in a series on building a lightweight C++ camera library across multiple platforms. We’ve already covered Windows and Linux, and now it’s time to dive into macOS. In this article, we’ll tap into AVFoundation to handle camera capture under the hood and leverage Cocoa for our UI essentials. You’ll see how to bridge Objective-C with C++ to build a camera library, and then integrate it with the Dynamsoft Barcode Reader SDK to create a barcode scanner on macOS.

What you’ll build: A native macOS barcode scanner application that uses AVFoundation for camera capture, Cocoa for the preview window, and Dynamsoft Barcode Reader SDK for real-time barcode decoding — all written in C++ with Objective-C bridging.

Key Takeaways

  • AVFoundation’s AVCaptureSession and AVCaptureVideoDataOutput provide direct camera access on macOS without requiring OpenCV.
  • Objective-C++ (.mm files) lets you mix Objective-C framework calls with C++ application logic in a single translation unit.
  • BGRA-to-RGB pixel conversion is required when passing AVFoundation frames to barcode SDKs that expect RGB input.
  • The Dynamsoft Barcode Reader SDK integrates with any raw-frame pipeline — pass the RGB buffer and receive decoded barcode results.

Common Developer Questions

  • How do I access the macOS camera from C++ without OpenCV?
  • How do I bridge Objective-C and C++ to use AVFoundation in a CMake project?
  • How do I build a real-time barcode scanner on macOS using AVFoundation and Dynamsoft Barcode Reader?

Prerequisites

  • A Mac with the Xcode Command Line Tools installed (provides clang plus the Cocoa and AVFoundation headers).
  • CMake for configuring and building the project.
  • The Dynamsoft Barcode Reader SDK and a valid license key.

Implement Camera Capture with AVFoundation

Let’s review the camera-related functions that need to be implemented for macOS:

  • std::vector<CaptureDeviceInfo> ListCaptureDevices(): Enumerates available cameras.
  • bool Open(int cameraIndex): Activates a specified camera.
  • void Release(): Releases the camera.
  • FrameData CaptureFrame(): Captures a frame from the camera.

Update the Camera Header File for macOS

Open the Camera.h header file and add the following changes:

class CAMERA_API Camera
{
public:
#ifdef _WIN32
    ...
#elif __APPLE__
    Camera() noexcept; 
    ~Camera();
#endif

    ...

private:
    ...

#ifdef __APPLE__
    void *captureSession; 
    void *videoOutput;
#endif
};

#endif 

Explanation

  • noexcept declares that the constructor will not throw exceptions.
  • captureSession and videoOutput are opaque void * handles that store the AVCaptureSession and AVCaptureVideoDataOutput objects, keeping Objective-C types out of the cross-platform C++ header.

Enumerate Available Cameras

Enumerate available cameras using the AVFoundation API:

std::vector<CaptureDeviceInfo> ListCaptureDevices()
{
    @autoreleasepool {
        std::vector<CaptureDeviceInfo> devicesInfo;

        NSArray<AVCaptureDevice *> *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
        for (AVCaptureDevice *device in devices)
        {
            CaptureDeviceInfo info = {};
            strncpy(info.friendlyName, [[device localizedName] UTF8String], sizeof(info.friendlyName) - 1);
            devicesInfo.push_back(info);
        }

        return devicesInfo;
    }
}

Explanation

  • AVCaptureDevice represents a physical capture device.
  • AVMediaTypeVideo restricts the enumeration to video devices.
  • localizedName returns a human-readable device name.
  • Note that devicesWithMediaType: has been deprecated since macOS 10.15; AVCaptureDeviceDiscoverySession is the modern replacement, though the older API still works for a simple enumeration.

Open a Camera with AVCaptureSession

The steps to open a camera are as follows:

  1. Get the available capture devices.
  2. Create a device input and a capture session.
  3. Configure the video output.
  4. Start the capture session.
bool Camera::Open(int cameraIndex)
{
    @autoreleasepool {
        NSArray<AVCaptureDevice *> *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
        if (cameraIndex < 0 || (NSUInteger)cameraIndex >= [devices count])
        {
            std::cerr << "Camera index out of range." << std::endl;
            return false;
        }

        AVCaptureDevice *device = devices[cameraIndex];

        NSError *error = nil;
        AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
        if (!input)
        {
            std::cerr << "Error creating device input: " << [[error localizedDescription] UTF8String] << std::endl;
            return false;
        }

        AVCaptureSession *cs = [[AVCaptureSession alloc] init];
        captureSession = (void *)cs;
        if (![cs canAddInput:input])
        {
            std::cerr << "Cannot add device input to session." << std::endl;
            return false;
        }
        [cs addInput:input];

        AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
        output.videoSettings = @{(NSString *)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32BGRA)};
        output.alwaysDiscardsLateVideoFrames = YES;

        videoOutput = (void *)output;

        if (![cs canAddOutput:output])
        {
            std::cerr << "Cannot add video output to session." << std::endl;
            return false;
        }
        [cs addOutput:output];

        [cs startRunning];

        return true;
    }
}

Explanation

  • AVCaptureDeviceInput captures data from the chosen camera device.
  • AVCaptureSession manages the flow of data from the device to the output. We store it in captureSession.
  • AVCaptureVideoDataOutput provides the raw video frames from the capture device. We store it in videoOutput.
  • kCVPixelBufferPixelFormatTypeKey requests a specific pixel format from the output; here we ask for kCVPixelFormatType_32BGRA, so each frame arrives as 4 bytes per pixel (B, G, R, A).

Release the Camera Session

Stop the capture session and release the resources:

void Camera::Release()
{
    if (captureSession)
    {
        AVCaptureSession *session = (__bridge AVCaptureSession *)captureSession;

        // Stop the session first so no more frames are delivered while
        // the output is being removed.
        [session stopRunning];

        if (videoOutput)
        {
            AVCaptureVideoDataOutput *output = (__bridge AVCaptureVideoDataOutput *)videoOutput;
            [session removeOutput:output];
            [output release]; // balance the alloc/init in Open (manual reference counting)
            videoOutput = nullptr;
        }

        [session release]; // balance the alloc/init in Open
        captureSession = nullptr;
    }
}

Capture a Video Frame as RGB Data

Capture a frame from the camera using the AVCaptureVideoDataOutputSampleBufferDelegate protocol:

@interface CaptureDelegate : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    FrameData *frame;
    dispatch_semaphore_t semaphore;
}

- (instancetype)initWithFrame:(FrameData *)frame semaphore:(dispatch_semaphore_t)semaphore;

@end

@implementation CaptureDelegate

- (instancetype)initWithFrame:(FrameData *)frame semaphore:(dispatch_semaphore_t)semaphore {
    self = [super init];
    if (self) {
        self->frame = frame;
        self->semaphore = semaphore;
    }
    return self;
}

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    
    // One-shot guard: the first delivered frame fills `frame`; ignore any
    // later callbacks so they cannot reallocate or overwrite it.
    if (frame->rgbData != nullptr) {
        return;
    }

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    if (!imageBuffer) {
        std::cerr << "Failed to get image buffer." << std::endl;
        dispatch_semaphore_signal(semaphore);
        return;
    }

    CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    
    frame->width = width;
    frame->height = height;
    frame->size = width * height * 3;
    frame->rgbData = new unsigned char[frame->size];

    OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);

    if (pixelFormat == kCVPixelFormatType_32BGRA) {
        unsigned char *src = (unsigned char *)baseAddress;
        unsigned char *dst = frame->rgbData;

        for (size_t y = 0; y < height; ++y) {
            for (size_t x = 0; x < width; ++x) {
                size_t offset = y * bytesPerRow + x * 4;
                dst[0] = src[offset + 2]; // R
                dst[1] = src[offset + 1]; // G
                dst[2] = src[offset + 0]; // B
                dst += 3;
            }
        }
    } else {
        std::cerr << "Unsupported pixel format." << std::endl;
    }

    CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
    dispatch_semaphore_signal(semaphore);
}


@end

FrameData Camera::CaptureFrame()
{
    @autoreleasepool {
        FrameData frame = {};

        if (!captureSession || !videoOutput) {
            std::cerr << "Capture session is not initialized." << std::endl;
            return frame;
        }

        AVCaptureSession *session = (__bridge AVCaptureSession *)captureSession;
        AVCaptureVideoDataOutput *vo = (__bridge AVCaptureVideoDataOutput *)videoOutput;
 
        dispatch_semaphore_t semaphore = dispatch_semaphore_create(0);

        CaptureDelegate *delegate = [[CaptureDelegate alloc] initWithFrame:&frame semaphore:semaphore];
        [vo setSampleBufferDelegate:delegate
                              queue:dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)];

        dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);

        // Detach the delegate before returning: `frame` lives on this stack,
        // so later callbacks must not touch it. The release balances the
        // alloc above (this file uses manual reference counting).
        [vo setSampleBufferDelegate:nil queue:NULL];
        [delegate release];

        frameWidth = frame.width;
        frameHeight = frame.height;
        return frame;
    }
}

Explanation

  • setSampleBufferDelegate sets the object that will receive frames from the AVCaptureVideoDataOutput.
  • FrameData *frame points to a struct where the captured image data is stored (width, height, and the RGB buffer).
  • dispatch_semaphore_t semaphore is used for synchronization. Once the frame is processed, the delegate signals the semaphore so the calling code knows the frame is ready.
  • In captureOutput:didOutputSampleBuffer:fromConnection, we extract the pixel buffer and convert BGRA to RGB (3 bytes per pixel).
  • CMSampleBufferGetImageBuffer(sampleBuffer) returns a CVImageBufferRef containing the pixel data.
  • CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress ensure thread safety during reading.
  • dispatch_semaphore_signal(semaphore) notifies whoever is waiting on this semaphore that the frame data is now ready.

Build the Cocoa Preview Window

Update the CameraPreview Header for macOS

Add macOS-specific members to the CameraPreview.h header file:

class CAMERA_API CameraWindow
{
public:
    ...

#ifdef _WIN32
    ...
#elif __APPLE__
    void *nsWindow; 
    void *contentView;
#endif
};

Explanation

  • nsWindow and contentView are used to manage the window and the corresponding content view on macOS.

Create a Custom NSView for Rendering Frames, Contours, and Text

Create a custom NSView subclass to handle drawing:

struct CameraContentViewImpl {
    std::vector<unsigned char> rgbData;
    int frameWidth = 0;
    int frameHeight = 0;
    int x = 0;
    int y = 0;
    int fontSize = 0; 
    std::vector<std::pair<int, int>> contourPoints;
    std::string displayText;
    CameraWindow::Color textColor;
};

@interface CameraContentView : NSView
{
    CameraContentViewImpl* impl; 
}
- (void)updateFrame:(const unsigned char*)data width:(int)width height:(int)height;
- (void)updateContour:(const std::vector<std::pair<int, int>>&)points;
- (void)updateText:(const std::string&)text
                x:(int)x
                y:(int)y
         fontSize:(int)fontSize
            color:(const CameraWindow::Color&)color;
@end

@implementation CameraContentView

- (instancetype)initWithFrame:(NSRect)frameRect {
    self = [super initWithFrame:frameRect];
    if (self) {
        impl = new CameraContentViewImpl();
    }
    return self;
}

- (void)dealloc {
    delete impl;
    [super dealloc];
}

- (void)updateFrame:(const unsigned char*)data width:(int)width height:(int)height {
    impl->rgbData.assign(data, data + (width * height * 3));
    impl->frameWidth = width;
    impl->frameHeight = height;
    [self setNeedsDisplay:YES];
}

- (void)updateContour:(const std::vector<std::pair<int, int>>&)points {
    impl->contourPoints = points;
    [self setNeedsDisplay:YES];
}

- (void)updateText:(const std::string&)text
                x:(int)x
                y:(int)y
         fontSize:(int)fontSize
            color:(const CameraWindow::Color&)color {
    impl->displayText = text;
    impl->x = x;
    impl->y = y;
    impl->fontSize = fontSize;
    impl->textColor = color;
    [self setNeedsDisplay:YES];
}

- (void)drawRect:(NSRect)dirtyRect {
    [super drawRect:dirtyRect];

    NSRect bounds = [self bounds];
    CGContextRef context = [[NSGraphicsContext currentContext] CGContext];
    if (impl->rgbData.empty() || impl->frameWidth == 0 || impl->frameHeight == 0) {
        return;
    }

    CGFloat scaleX = bounds.size.width / impl->frameWidth;
    CGFloat scaleY = bounds.size.height / impl->frameHeight;

    CGFloat scale = MIN(scaleX, scaleY);

    CGFloat offsetX = (bounds.size.width - (impl->frameWidth * scale)) / 2.0;
    CGFloat offsetY = (bounds.size.height - (impl->frameHeight * scale)) / 2.0;

    CGContextSaveGState(context); 

    CGContextTranslateCTM(context, offsetX, offsetY);
    CGContextScaleCTM(context, scale, scale);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGDataProviderRef provider = CGDataProviderCreateWithData(NULL, impl->rgbData.data(), impl->rgbData.size(), NULL);
    CGImageRef image = CGImageCreate(impl->frameWidth, impl->frameHeight, 8, 24, impl->frameWidth * 3, colorSpace, kCGBitmapByteOrderDefault | kCGImageAlphaNone, provider, NULL, false, kCGRenderingIntentDefault);

    CGRect rect = CGRectMake(0, 0, impl->frameWidth, impl->frameHeight);
    CGContextDrawImage(context, rect, image);

    CGImageRelease(image);
    CGDataProviderRelease(provider);
    CGColorSpaceRelease(colorSpace);

    if (!impl->contourPoints.empty()) {
        CGContextSaveGState(context); 

        CGContextSetLineWidth(context, 3.0 / scale); 
        CGContextSetStrokeColorWithColor(context, [[NSColor yellowColor] CGColor]);

        auto firstPoint = impl->contourPoints[0];
        CGContextMoveToPoint(context, firstPoint.first, impl->frameHeight - firstPoint.second);

        for (size_t i = 1; i < impl->contourPoints.size(); ++i) {
            auto point = impl->contourPoints[i];
            CGContextAddLineToPoint(context, point.first, impl->frameHeight - point.second);
        }

        CGContextClosePath(context);
        CGContextStrokePath(context);

        CGContextRestoreGState(context); 

        impl->contourPoints.clear(); 
    }

    CGContextRestoreGState(context); 

    if (!impl->displayText.empty()) {
        CGContextSaveGState(context); 
    
        CGFloat scaledX = impl->x * scale + offsetX;
        CGFloat scaledY = impl->y * scale + offsetY;

        NSColor *color = [NSColor colorWithRed:impl->textColor.r / 255.0 green:impl->textColor.g / 255.0 blue:impl->textColor.b / 255.0 alpha:1.0];

        NSDictionary *attributes = @{
            NSFontAttributeName : [NSFont systemFontOfSize:impl->fontSize * scale],
            NSForegroundColorAttributeName : color
        };

        NSPoint point = NSMakePoint(scaledX, bounds.size.height - scaledY - (impl->fontSize * scale));
        NSString *nsText = [NSString stringWithUTF8String:impl->displayText.c_str()];
        [nsText drawAtPoint:point withAttributes:attributes];

        CGContextRestoreGState(context);

        impl->displayText.clear(); 
    }
}

@end

Explanation

  • CameraContentViewImpl holds all the data required for rendering (frame data, text, etc.).
  • CameraContentView is an NSView subclass that draws the camera frame, contours, and text.
  • Calling setNeedsDisplay: triggers a redraw, which in turn calls drawRect.
  • In drawRect:, the scaling factors and offsets are calculated to maintain aspect ratio and center the image.

Handle Window Close Events

Create a custom NSWindowDelegate to handle window events:

@interface CameraWindowDelegate : NSObject <NSWindowDelegate>
@end

@implementation CameraWindowDelegate
- (BOOL)windowShouldClose:(id)sender {
    [NSApp terminate:nil];
    return YES;
}
@end

Initialize the NSWindow and Content View

Initialize the window and content view, and store them in nsWindow and contentView:

bool CameraWindow::Create() {
    @autoreleasepool {
        if (NSApp == nil) {
            [NSApplication sharedApplication];
            [NSApp setActivationPolicy:NSApplicationActivationPolicyRegular];
            [NSApp finishLaunching];
        }

        NSRect contentRect = NSMakeRect(100, 100, width, height);
        NSUInteger styleMask = NSWindowStyleMaskTitled | NSWindowStyleMaskClosable | NSWindowStyleMaskResizable;
        NSWindow *window = [[NSWindow alloc] initWithContentRect:contentRect
                                                       styleMask:styleMask
                                                         backing:NSBackingStoreBuffered
                                                           defer:NO];
        if (!window) {
            return false;
        }

        [window setTitle:[NSString stringWithUTF8String:title.c_str()]];
        [window makeKeyAndOrderFront:nil];

        CameraContentView *cv = [[CameraContentView alloc] initWithFrame:contentRect];
        [window setContentView:cv];
        contentView = cv;

        CameraWindowDelegate *delegate = [[CameraWindowDelegate alloc] init];
        [window setDelegate:delegate];

        nsWindow = (void *)window;
        return true;
    }
}

Display the Window

Bring the application to the front and make it active:

void CameraWindow::Show() {
    @autoreleasepool {
        [NSApp activateIgnoringOtherApps:YES];
    }
}

Listen for Keyboard Input

Capture keyboard input events:

bool CameraWindow::WaitKey(char key)
{
    @autoreleasepool {
        NSEvent *event = [NSApp nextEventMatchingMask:NSEventMaskAny
                                           untilDate:[NSDate distantPast]
                                              inMode:NSDefaultRunLoopMode
                                             dequeue:YES];
        if (event) {
            [NSApp sendEvent:event];

            if (event.type == NSEventTypeKeyDown) {
                NSString *characters = [event charactersIgnoringModifiers];
                if ([characters length] > 0) {
                    char pressedKey = [characters characterAtIndex:0];
                    if (key == '\0' || pressedKey == key || pressedKey == std::toupper(key)) {
                        return false;  
                    }
                }
            }
        }
        return true;
    }
}

Draw the Camera Frame, Contours, and Text

Update the camera frame, contours, and text as follows:

void CameraWindow::ShowFrame(const unsigned char *rgbData, int frameWidth, int frameHeight) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateFrame:rgbData width:frameWidth height:frameHeight];
    }
}

void CameraWindow::DrawContour(const std::vector<std::pair<int, int>> &points) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateContour:points];
    }
}

void CameraWindow::DrawText(const std::string &text, int x, int y, int fontSize, const CameraWindow::Color &color) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateText:text x:x y:y fontSize:fontSize color:color];
    }
}

Configure CMakeLists.txt for macOS Frameworks

To build the library on macOS, update the CMakeLists.txt:

...

if (WIN32)
    ...
elseif (UNIX AND NOT APPLE)
    ...
elseif (APPLE)
    set(LIBRARY_SOURCES
        src/CameraMacOS.mm
        src/CameraPreviewMacOS.mm
    )
    set_source_files_properties(src/CameraMacOS.mm src/CameraPreviewMacOS.mm PROPERTIES COMPILE_FLAGS "-x objective-c++")
    set_source_files_properties(src/main.cpp PROPERTIES COMPILE_FLAGS "-x objective-c++")
endif()
...

if (UNIX AND NOT APPLE)
    ...
elseif (APPLE)
    find_library(COCOA_LIBRARY Cocoa REQUIRED)
    find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
    find_library(COREMEDIA_LIBRARY CoreMedia REQUIRED)
    find_library(COREVIDEO_LIBRARY CoreVideo REQUIRED)
    find_library(OBJC_LIBRARY objc REQUIRED)  # Add the Objective-C runtime library

    target_link_libraries(litecam PRIVATE 
        ${COCOA_LIBRARY} 
        ${AVFOUNDATION_LIBRARY} 
        ${COREMEDIA_LIBRARY} 
        ${COREVIDEO_LIBRARY} 
        ${OBJC_LIBRARY}  # Link the Objective-C runtime
    )
elseif (WIN32)
    ...
endif()
...

if (APPLE)
    target_link_libraries(camera_capture PRIVATE 
        ${COCOA_LIBRARY} 
        ${AVFOUNDATION_LIBRARY} 
        ${COREMEDIA_LIBRARY} 
        ${COREVIDEO_LIBRARY} 
        ${OBJC_LIBRARY}  # Link the Objective-C runtime
    )
endif()

Explanation

  • -x objective-c++ forces the compiler to treat the listed source files as Objective-C++, so Objective-C message sends and C++ code can coexist in one translation unit.
  • On macOS, link against the Cocoa, AVFoundation, CoreMedia, and CoreVideo frameworks, plus the Objective-C runtime (objc).
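Pulled together, a minimal standalone CMakeLists.txt for a single Objective-C++ target might look like this (target and file names are illustrative; adapt them to your project layout):

```cmake
cmake_minimum_required(VERSION 3.15)
project(litecam_demo CXX)

add_executable(litecam_demo src/CameraMacOS.mm src/main.cpp)

if (APPLE)
    # Compile every listed source as Objective-C++ so AVFoundation
    # message sends and C++ code can coexist in one translation unit.
    set_source_files_properties(src/CameraMacOS.mm src/main.cpp
        PROPERTIES COMPILE_FLAGS "-x objective-c++")

    find_library(COCOA_LIBRARY Cocoa REQUIRED)
    find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
    find_library(COREMEDIA_LIBRARY CoreMedia REQUIRED)
    find_library(COREVIDEO_LIBRARY CoreVideo REQUIRED)
    find_library(OBJC_LIBRARY objc REQUIRED)

    target_link_libraries(litecam_demo PRIVATE
        ${COCOA_LIBRARY} ${AVFOUNDATION_LIBRARY}
        ${COREMEDIA_LIBRARY} ${COREVIDEO_LIBRARY} ${OBJC_LIBRARY})
endif()
```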

Build and Run the macOS Barcode Scanner

The barcode scanning logic itself is platform-independent, so no code changes are needed; only the build configuration differs. Follow these steps:

  1. Update the CMakeLists.txt file to include the macOS-specific configuration.

     ...
        
     if(WIN32)
        
         ...
        
     elseif(APPLE)
         set(CMAKE_CXX_FLAGS "-std=c++11 -O3 -Wl,-rpath,@executable_path")
         set(CMAKE_INSTALL_RPATH "@executable_path")
        
         link_directories(
             ${CMAKE_CURRENT_SOURCE_DIR}/../../dist/lib/macos
             ${CMAKE_CURRENT_SOURCE_DIR}/../../../examples/10.x/sdk/platforms/macos
         )
        
         set(DBR_LIBS
             "DynamsoftCore"
             "DynamsoftLicense"
             "DynamsoftCaptureVisionRouter"
             "DynamsoftUtility"
             "pthread"
         )
     elseif(UNIX)
         ...
     endif()
        
     ...
        
     if(WIN32)
         ...
     elseif(APPLE)
         add_custom_command(TARGET BarcodeScanner POST_BUILD
             COMMAND ${CMAKE_COMMAND} -E copy_directory
             ${CMAKE_CURRENT_SOURCE_DIR}/../../../examples/10.x/sdk/platforms/macos
             $<TARGET_FILE_DIR:BarcodeScanner>
         )
     elseif(UNIX)
         ...
     endif()
        
        
    
  2. Build the application using CMake.

     mkdir build
     cd build
     cmake ..
     cmake --build .
    

     (Screenshot: the macOS barcode scanner in action)

Common Issues and Edge Cases

  • Camera permission denied: macOS requires a camera usage description in your Info.plist. If your app does not include NSCameraUsageDescription, the system will silently deny camera access. Add <key>NSCameraUsageDescription</key><string>This app needs camera access to scan barcodes.</string> to your Info.plist.
  • Black frames on Apple Silicon: Some M-series Macs deliver the first few frames as blank buffers. Skip frames where CVPixelBufferGetBaseAddress returns all zeros, or add a short warm-up delay after calling [session startRunning].
  • Linking errors for Objective-C runtime: If you get Undefined symbols for architecture errors referencing objc_msgSend, ensure you link the objc runtime library (find_library(OBJC_LIBRARY objc REQUIRED)) and pass -x objective-c++ to the compiler for .mm files.

FAQ

How do I access the macOS camera from C++ without OpenCV?

Use AVFoundation via Objective-C++ (.mm files). Create an AVCaptureSession, attach an AVCaptureDeviceInput for the camera, and route output through AVCaptureVideoDataOutput to receive raw BGRA frames in a C++ callback.

What pixel format does AVFoundation deliver, and how do I convert it for barcode decoding?

AVFoundation delivers frames in BGRA by default when you set kCVPixelFormatType_32BGRA. Convert to RGB by swapping the blue and red channels (offset +2 → R, offset +1 → G, offset +0 → B) in a per-pixel loop.

Why does my CMake build fail with “Undefined symbols” on macOS?

You need to link the Cocoa, AVFoundation, CoreMedia, CoreVideo, and objc frameworks. Also ensure .mm source files have the -x objective-c++ compile flag set via set_source_files_properties in CMakeLists.txt.

Source Code

https://github.com/yushulx/cmake-cpp-barcode-qrcode-mrz/tree/main/litecam