Build a React Native Document Scanner

In the previous article, we built a document normalization frame processor for React Native Vision Camera based on the Dynamsoft Document Normalizer SDK. In this article, we are going to use it to build a React Native document scanning demo app.

Preview of the final result:

The app can automatically detect a document and run perspective transformation to get a deskewed document image. The image can be saved in one of three color modes: binary, grayscale, and color.

Build a React Native Document Scanner

Let’s do this in steps.

New Project

Create a new React Native project with TypeScript support.

npx react-native init DocumentScanner --template react-native-template-typescript

Add Dependencies

  • Install the following packages to use the camera with the document normalization frame processor plugin:

    npm install vision-camera-dynamsoft-document-normalizer react-native-vision-camera react-native-reanimated
    

    We also need to update babel.config.js file for the frame processor plugin:

     module.exports = {
       presets: ['module:metro-react-native-babel-preset'],
    +  plugins: [
    +    [
    +      'react-native-reanimated/plugin',
    +      {
    +        globals: ['__detect'],
    +      },
    +    ],
    +  ]
     };
    
  • Install @react-navigation/native and dependent packages to add navigation.

    npm install @react-navigation/native @react-navigation/native-stack react-native-safe-area-context react-native-screens
    
  • Install react-native-svg to draw the overlay for detected documents.

    npm install react-native-svg
    
  • Install react-native-share to share the normalized image.

    npm install react-native-share
    
  • Install react-native-simple-radio-button to provide a radio button component.

    npm install react-native-simple-radio-button
    

Add Camera Permission

For Android, add the following to AndroidManifest.xml.

<uses-permission android:name="android.permission.CAMERA" />

For iOS, add the following to Info.plist.

<key>NSCameraUsageDescription</key>
<string>For document scanning</string>
Add Pages

  1. Create a src folder and move App.tsx into it. Replace its content with the following:

    import * as React from 'react';
    import { NavigationContainer } from '@react-navigation/native';
    import { createNativeStackNavigator } from '@react-navigation/native-stack';
    import ScannerScreen from './screens/Scanner';
    import HomeScreen from './screens/Home';
    import ResultViewerScreen from './screens/ResultViewer';
    
    const Stack = createNativeStackNavigator();
    
    export default function App() {
      return (
        <NavigationContainer>
          <Stack.Navigator>
            <Stack.Screen name="Home" component={HomeScreen} />
            <Stack.Screen name="Scanner" component={ScannerScreen} />
            <Stack.Screen name="ResultViewer" component={ResultViewerScreen} />
          </Stack.Navigator>
        </NavigationContainer>
      );
    }
    
  2. Create a screens folder with the following files:

    screens/Home.tsx
    screens/ResultViewer.tsx
    screens/Scanner.tsx
    

    A template of the files looks like this:

    import React from "react";
    import { SafeAreaView, StyleSheet } from "react-native";
    
    export default function HomeScreen({route, navigation}) {
      return (
        <SafeAreaView style={styles.container}>
        </SafeAreaView>
      );
    }
    
    const styles = StyleSheet.create({
      container: {
        flex:1,
      },
    });
    

Next, we are going to implement the three files.

Home Page

On the home page, add a button that navigates to the scanner page.

export default function HomeScreen({route, navigation}) {
  const onPressed = () => {
    navigation.navigate(
      {
        name: "Scanner"
      }
    );
  }

  return (
    <SafeAreaView style={styles.container}>
      <TouchableOpacity
        style={styles.button}
        onPress={() => onPressed()}
      >
        <Text style={styles.buttonText}>Scan Document</Text>
      </TouchableOpacity>
    </SafeAreaView>
  );
}

Scanner Page

  1. In the scanner page, add a camera component first to test if the camera works.

    export default function ScannerScreen({route, navigation}) {
      const camera = useRef<Camera>(null)
      const [hasPermission, setHasPermission] = useState(false);
      const devices = useCameraDevices();
      const device = devices.back;
      useEffect(() => {
        (async () => {
          const status = await Camera.requestCameraPermission();
          setHasPermission(status === 'authorized');
        })();
      }, []);
    
      return (
         <SafeAreaView style={styles.container}>
           {device != null &&
           hasPermission && (
           <>
               <Camera
                 style={StyleSheet.absoluteFill}
                 ref={camera}
                 isActive={true}
                 device={device}
               />
           </>)}
         </SafeAreaView>
      );
    }
    
    const styles = StyleSheet.create({
      container: {
        flex: 1
      },
    });
    
  2. Use the document normalization frame processor to detect documents.

    Define the frame processor:

    const frameProcessor = useFrameProcessor((frame) => {
      'worklet'
      const results = DDN.detect(frame);
      console.log(results);
    }, [])
    

    Then pass it to the camera component’s props.

    <Camera
      style={StyleSheet.absoluteFill}
      ref={camera}
      isActive={true}
      device={device}
      frameProcessor={frameProcessor}
      frameProcessorFps={5}
    />
    

    Remember to initialize the license for Dynamsoft Document Normalizer (you can apply for a trial license).

    let result = await DDN.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==");
    
  3. Draw the overlay for the detected document.

    1. Add an SVG component below the camera component.

      <Svg preserveAspectRatio='xMidYMid slice' style={StyleSheet.absoluteFill} viewBox={viewBox.value}>
        <Polygon
          points={pointsData.value}
          fill="lime"
          stroke="green"
          opacity="0.5"
          strokeWidth="1"
        />
      </Svg>
      
    2. Create several shared values (including a taken flag to pause detection after a photo is captured) so that we can pass values from the frame processor to the JavaScript side.

      const taken = REA.useSharedValue(false); //set to true after a photo is taken
      const detectionResults = REA.useSharedValue([] as DetectedQuadResult[]);
      const frameWidth = REA.useSharedValue(0);
      const frameHeight = REA.useSharedValue(0);
      const frameProcessor = useFrameProcessor((frame) => {
        'worklet'
        if (taken.value === false) {
          const results = DDN.detect(frame);
          console.log(results);
          frameWidth.value = frame.width;
          frameHeight.value = frame.height;
          detectionResults.value = results;
        }
      }, [])
      
    3. Define several derived values that update when the shared values change.

      viewBox for the SVG’s viewBox attribute:

      const viewBox = REA.useDerivedValue(() => {
        let viewBox = "";
        let rotated = false;
        if (platform.value === "android") {
          if (!(frameWidth.value>frameHeight.value && screenWidth.value>screenHeight.value)){
            rotated = true;
          }
        }
        if (rotated) {  //The frame on Android may be rotated. We need to switch the width and height.
          viewBox = "0 0 "+frameHeight.value+" "+frameWidth.value;
        }else{
          viewBox = "0 0 "+frameWidth.value+" "+frameHeight.value;
        }
        return viewBox;
      }, [frameWidth,frameHeight]);
      
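      The rotation branching above can be captured as a pure function, which makes it easy to unit test. Below is a sketch with a hypothetical computeViewBox helper (not part of the original code):

      ```typescript
      // Pure sketch of the viewBox logic: on Android the frame may be rotated
      // relative to the screen, in which case width and height are swapped.
      function computeViewBox(platform: string, frameWidth: number, frameHeight: number,
                              screenWidth: number, screenHeight: number): string {
        let rotated = false;
        if (platform === "android") {
          if (!(frameWidth > frameHeight && screenWidth > screenHeight)) {
            rotated = true;
          }
        }
        return rotated
          ? "0 0 " + frameHeight + " " + frameWidth
          : "0 0 " + frameWidth + " " + frameHeight;
      }

      // A landscape frame on a portrait Android screen is treated as rotated:
      console.log(computeViewBox("android", 1280, 720, 1080, 1920)); // "0 0 720 1280"
      ```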

      pointsData which is used to represent the polygon SVG element’s location:

      const [pointsText, setPointsText] = useState("default");
      const pointsData = REA.useDerivedValue(() => {
        console.log("update pointsData");
        let data = "";
        if (detectionResults.value.length>0) {
          let result = detectionResults.value[0];
          if (result) {
            let location = result.location;
            let pointsData = location.points[0].x + "," + location.points[0].y + " ";
            pointsData = pointsData + location.points[1].x + "," + location.points[1].y +" ";
            pointsData = pointsData + location.points[2].x + "," + location.points[2].y +" ";
            pointsData = pointsData + location.points[3].x + "," + location.points[3].y;
            data = pointsData;
          }
        }
        REA.runOnJS(setPointsText)(data); //update the state to rerender the component
        return data;
      }, [detectionResults]);
      
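      The string building above can likewise be extracted into a small pure function. A sketch in plain TypeScript (the Point shape here is a stand-in for the plugin's Point type):

      ```typescript
      // Hypothetical minimal Point shape; the real one comes from
      // vision-camera-dynamsoft-document-normalizer.
      interface Point { x: number; y: number; }

      // Join the four corners into the "x1,y1 x2,y2 x3,y3 x4,y4" format
      // that the SVG Polygon element's "points" attribute expects.
      function pointsToSVGPointsString(points: Point[]): string {
        return points.map(p => p.x + "," + p.y).join(" ");
      }

      console.log(pointsToSVGPointsString(
        [{x:0,y:0},{x:100,y:0},{x:100,y:50},{x:0,y:50}]
      )); // "0,0 100,0 100,50 0,50"
      ```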
  4. Take a photo when the IoU (intersection over union) values of three consecutive detected polygons all exceed 0.9.

    const previousResults = useRef([] as DetectedQuadResult[]);
    useEffect(() => { //trigger checking of the IoUs when a new polygon is detected.
      checkIfSteady();
    }, [pointsText]);
       
    const checkIfSteady = async () => {
      let result = detectionResults.value[0];
      if (result) {
        if (previousResults.current.length >= 3) {
          if (steady() == true) {
            await takePhoto();
            console.log("steady");
          }else{
            console.log("shift and add result");
            previousResults.current.shift();
            previousResults.current.push(result);
          }
        }else{
          console.log("add result");
          previousResults.current.push(result);
        }
      }
    }
       
    const steady = () => {
      if (previousResults.current[0] && previousResults.current[1] && previousResults.current[2]) {
        let iou1 = intersectionOverUnion(previousResults.current[0].location.points,previousResults.current[1].location.points);
        let iou2 = intersectionOverUnion(previousResults.current[1].location.points,previousResults.current[2].location.points);
        let iou3 = intersectionOverUnion(previousResults.current[0].location.points,previousResults.current[2].location.points);
        if (iou1>0.9 && iou2>0.9 && iou3>0.9) {
          return true;
        }else{
          return false;
        }
      }
      return false;
    }
    

    The takePhoto function is the following. Because the photo taken may have a higher resolution and be rotated, we need to calculate the width ratio and height ratio to fit the detected location from the camera preview to the photo taken.

    const [photoPath, setPhotoPath] = useState<undefined|string>(undefined);
    const widthRatio = useRef(1);
    const heightRatio = useRef(1);
    const takePhoto = async () => {
      console.log("take photo");
      if (camera.current) {
        taken.value = true;
        const photo = await camera.current.takePhoto();
        let rotated = false;
        if (platform.value === "android") {
          if (!(frameWidth.value>frameHeight.value && screenWidth.value>screenHeight.value)){
            rotated = true;
          }
          if (rotated) {
            widthRatio.current = frameHeight.value/photo.width;
            heightRatio.current = frameWidth.value/photo.height;
          } else {
            widthRatio.current = frameWidth.value/photo.width;
            heightRatio.current = frameHeight.value/photo.height;
          }
        } else {
          let photoRotated = false;
          if (!(photo.width>photo.height && screenWidth.value>screenHeight.value)){
            photoRotated = true;
          }
          if (photoRotated) {
            widthRatio.current = frameWidth.value/photo.height;
            heightRatio.current = frameHeight.value/photo.width;
          }else{
            widthRatio.current = frameWidth.value/photo.width;
            heightRatio.current = frameHeight.value/photo.height;
          }
        }
        setPhotoPath(photo.path);
      }
    }
    
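    The ratio computation itself is a pure calculation. As a sketch (with a hypothetical computeRatios helper, not part of the original code), the rotated branch swaps the frame's width and height before dividing by the photo dimensions:

    ```typescript
    // Pure sketch of the preview-to-photo ratio computation: if the frame is
    // rotated relative to the photo, the frame's width/height are swapped.
    function computeRatios(frameWidth: number, frameHeight: number,
                           photoWidth: number, photoHeight: number,
                           rotated: boolean): {widthRatio: number, heightRatio: number} {
      if (rotated) {
        return { widthRatio: frameHeight / photoWidth, heightRatio: frameWidth / photoHeight };
      }
      return { widthRatio: frameWidth / photoWidth, heightRatio: frameHeight / photoHeight };
    }

    // e.g. a 1280x720 preview frame vs. a 4096x2304 photo (hypothetical sizes):
    console.log(computeRatios(1280, 720, 4096, 2304, false)); // ratios of 0.3125
    ```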

    A Utils.tsx file is created to store the functions for calculating the IoU between polygons.

    import type { Point, Rect } from "vision-camera-dynamsoft-document-normalizer";
    
    export function intersectionOverUnion(pts1:Point[] ,pts2:Point[]) : number {
      let rect1 = getRectFromPoints(pts1);
      let rect2 = getRectFromPoints(pts2);
      return rectIntersectionOverUnion(rect1, rect2);
    }
    
    function rectIntersectionOverUnion(rect1:Rect, rect2:Rect) : number {
      let leftColumnMax = Math.max(rect1.left, rect2.left);
      let rightColumnMin = Math.min(rect1.right,rect2.right);
      let upRowMax = Math.max(rect1.top, rect2.top);
      let downRowMin = Math.min(rect1.bottom,rect2.bottom);
    
      if (leftColumnMax>=rightColumnMin || downRowMin<=upRowMax){
        return 0;
      }
    
      let s1 = rect1.width*rect1.height;
      let s2 = rect2.width*rect2.height;
      let sCross = (downRowMin-upRowMax)*(rightColumnMin-leftColumnMax);
      return sCross/(s1+s2-sCross);
    }
    
    function getRectFromPoints(points:Point[]) : Rect {
      if (points[0]) {
        let left:number;
        let top:number;
        let right:number;
        let bottom:number;
           
        left = points[0].x;
        top = points[0].y;
        right = 0;
        bottom = 0;
    
        points.forEach(point => {
          left = Math.min(point.x,left);
          top = Math.min(point.y,top);
          right = Math.max(point.x,right);
          bottom = Math.max(point.y,bottom);
        });
    
        let r:Rect = {
          left: left,
          top: top,
          right: right,
          bottom: bottom,
          width: right - left,
          height: bottom - top
        };
           
        return r;
      }else{
        throw new Error("Invalid number of points");
      }
    }
    
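    As a sanity check of the IoU logic: identical quads should score 1, and a slightly shifted quad should fall below the 0.9 "steady" threshold. The sketch below inlines the same bounding-rect approach as Utils.tsx so it runs without the plugin package:

    ```typescript
    interface Point { x: number; y: number; }

    // Bounding-rect IoU, mirroring the logic in Utils.tsx, inlined here
    // so the sketch is self-contained.
    function intersectionOverUnion(pts1: Point[], pts2: Point[]): number {
      const rect = (pts: Point[]) => {
        const xs = pts.map(p => p.x), ys = pts.map(p => p.y);
        return { left: Math.min(...xs), top: Math.min(...ys),
                 right: Math.max(...xs), bottom: Math.max(...ys) };
      };
      const r1 = rect(pts1), r2 = rect(pts2);
      const left = Math.max(r1.left, r2.left);
      const right = Math.min(r1.right, r2.right);
      const top = Math.max(r1.top, r2.top);
      const bottom = Math.min(r1.bottom, r2.bottom);
      if (left >= right || bottom <= top) { return 0; }
      const s1 = (r1.right - r1.left) * (r1.bottom - r1.top);
      const s2 = (r2.right - r2.left) * (r2.bottom - r2.top);
      const cross = (right - left) * (bottom - top);
      return cross / (s1 + s2 - cross);
    }

    const quad: Point[] = [{x:0,y:0},{x:100,y:0},{x:100,y:100},{x:0,y:100}];
    const shifted: Point[] = quad.map(p => ({x: p.x + 5, y: p.y + 5}));
    console.log(intersectionOverUnion(quad, quad));    // 1: identical quads
    console.log(intersectionOverUnion(quad, shifted)); // ≈ 0.82, below the 0.9 threshold
    ```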
  5. Display the photo taken with an Image component and two buttons to let the user choose whether to retake the photo or move on to the result viewer.

    {photoPath && (
      <>
        <Image
          style={StyleSheet.absoluteFill}
          source={{uri: "file://" + photoPath}}
        />
        <View style={styles.control}>
          <View style={styles.buttonContainer}>
            <TouchableOpacity onPress={retake} style={styles.button}>
              <Text style={styles.buttonText}>Retake</Text>
            </TouchableOpacity>
          </View>
          <View style={styles.buttonContainer}>
            <TouchableOpacity onPress={okay} style={styles.button}>
              <Text style={styles.buttonText}>Okay</Text>
            </TouchableOpacity>
          </View>
        </View>
      </>
    )}
    

    Relevant functions:

    const retake = () => {
      detectionResults.value = [];
      previousResults.current = [];
      setPhotoPath(undefined);
      taken.value = false;
    }
    
    const okay = () => {
      console.log("okay");
      let result = detectionResults.value[0];
      if (result) {
        result = scaleDetectionResult(result);
      }
      navigation.navigate(
        {
          params: {photoPath:photoPath, detectionResult:result},
          name: "ResultViewer"
        }
      );
    }
       
    const scaleDetectionResult = (result:DetectedQuadResult):DetectedQuadResult =>  {
      let points:[Point,Point,Point,Point] = [{x:0,y:0},{x:0,y:0},{x:0,y:0},{x:0,y:0}];
      for (let index = 0; index < result.location.points.length; index++) {
        const point = result.location.points[index];
        if (point) {
          let newPoint:Point = {
            x:point.x / widthRatio.current,
            y:point.y / heightRatio.current
          };
          points[index] = newPoint;
        }
      }
      let quad:Quadrilateral = {
        points: points
      };
      let newQuadResult:DetectedQuadResult = {
        confidenceAsDocumentBoundary:result.confidenceAsDocumentBoundary,
        location: quad
      };
      return newQuadResult;
    }
    
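    The coordinate mapping in scaleDetectionResult is just a division by the precomputed ratios. A standalone sketch (with hypothetical frame and photo sizes):

    ```typescript
    interface Point { x: number; y: number; }

    // Map a point detected in the camera frame to the coordinate space of the
    // captured photo by dividing with the precomputed frame/photo ratios.
    function scalePoint(point: Point, widthRatio: number, heightRatio: number): Point {
      return { x: point.x / widthRatio, y: point.y / heightRatio };
    }

    // e.g. a 1280x720 frame vs. a 4096x2304 photo (hypothetical sizes):
    const widthRatio = 1280 / 4096;   // 0.3125
    const heightRatio = 720 / 2304;   // 0.3125
    console.log(scalePoint({x: 640, y: 360}, widthRatio, heightRatio)); // x: 2048, y: 1152
    ```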

Result Viewer Page

The result viewer page uses an Image component to display the normalized image and three radio buttons to select which color mode to use for the normalization. There is also a share button to share the normalized image.

const radio_props = [
  {label: 'Binary', value: 0 },
  {label: 'Gray', value: 1 },
  {label: 'Color', value: 2 }
];

let normalizedResult:any = {};

export default function ResultViewerScreen({route, navigation}) {
  const [normalizedImagePath, setNormalizedImagePath] = useState<undefined|string>(undefined);

  useEffect(() => {
    normalizedResult = {};
    normalize(0);
  }, []);

  const share = () => {
    console.log("share");
    let options:ShareOptions = {};
    options.url = "file://" + normalizedImagePath;
    Share.open(options);
  }

  const normalize = async (value:number) => {
    console.log(value);
    if (normalizedResult[value]) {
      setNormalizedImagePath(normalizedResult[value]);
    }else{
      if (value === 0) {
        await DDN.initRuntimeSettingsFromString("{\"GlobalParameter\":{\"Name\":\"GP\",\"MaxTotalImageDimension\":0},\"ImageParameterArray\":[{\"Name\":\"IP-1\",\"NormalizerParameterName\":\"NP-1\",\"BaseImageParameterName\":\"\"}],\"NormalizerParameterArray\":[{\"Name\":\"NP-1\",\"ContentType\":\"CT_DOCUMENT\",\"ColourMode\":\"ICM_BINARY\"}]}");
      } else if (value === 1) {
        await DDN.initRuntimeSettingsFromString("{\"GlobalParameter\":{\"Name\":\"GP\",\"MaxTotalImageDimension\":0},\"ImageParameterArray\":[{\"Name\":\"IP-1\",\"NormalizerParameterName\":\"NP-1\",\"BaseImageParameterName\":\"\"}],\"NormalizerParameterArray\":[{\"Name\":\"NP-1\",\"ContentType\":\"CT_DOCUMENT\",\"ColourMode\":\"ICM_GRAYSCALE\"}]}");
      } else {
        await DDN.initRuntimeSettingsFromString("{\"GlobalParameter\":{\"Name\":\"GP\",\"MaxTotalImageDimension\":0},\"ImageParameterArray\":[{\"Name\":\"IP-1\",\"NormalizerParameterName\":\"NP-1\",\"BaseImageParameterName\":\"\"}],\"NormalizerParameterArray\":[{\"Name\":\"NP-1\",\"ContentType\":\"CT_DOCUMENT\",\"ColourMode\":\"ICM_COLOUR\"}]}");
      }
      console.log("update settings done");
      let detectionResult:DetectedQuadResult = route.params.detectionResult;
      let photoPath = route.params.photoPath;
      let normalizedImageResult = await DDN.normalizeFile(photoPath, detectionResult.location,{saveNormalizationResultAsFile:true});
      console.log(normalizedImageResult);
      if (normalizedImageResult.imageURL) {
        normalizedResult[value] = normalizedImageResult.imageURL;
        setNormalizedImagePath(normalizedImageResult.imageURL)
      }
    }
  }

  return (
    <SafeAreaView style={styles.container}>
      {normalizedImagePath && (
        <Image
          style={[StyleSheet.absoluteFill,styles.image]}
          source={{uri: "file://" + normalizedImagePath}}
        />
      )}
      <View style={styles.control}>
        <View style={styles.buttonContainer}>
          <TouchableOpacity onPress={share} style={styles.button}>
            <Text>Share</Text>
          </TouchableOpacity>
        </View>
        <View style={styles.radioContainer}>
          <RadioForm
            radio_props={radio_props}
            initial={0}
            formHorizontal={true}
            labelHorizontal={false}
            onPress={(value) => {normalize(value)}}
          />
        </View>
      </View>
    </SafeAreaView>
  );
}

const styles = StyleSheet.create({
  container: {
    flex:1,
  },
  control:{
    flexDirection:"row",
    position: 'absolute',
    bottom: 0,
    height: "15%",
    width:"100%",
    alignSelf:"flex-start",
    alignItems: 'center',
  },
  radioContainer:{
    flex: 0.7,
    padding: 5,
    margin: 3,
  },
  buttonContainer:{
    flex: 0.3,
    padding: 5,
    margin: 3,
  },
  button: {
    backgroundColor: "ghostwhite",
    borderColor:"black", 
    borderWidth:2, 
    borderRadius:5,
    padding: 8,
    margin: 3,
  },
  image: {
    resizeMode:"contain",
  }
});
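The three runtime-settings strings passed to initRuntimeSettingsFromString differ only in the ColourMode field. As a sketch (a hypothetical buildRuntimeSettings helper, not part of the original demo), they could be built from the selected radio value instead of repeating the JSON:

```typescript
// Hypothetical helper: build the runtime-settings JSON used above,
// varying only the ColourMode field per selected radio value.
function buildRuntimeSettings(colorMode: number): string {
  const modes = ["ICM_BINARY", "ICM_GRAYSCALE", "ICM_COLOUR"];
  const settings = {
    GlobalParameter: { Name: "GP", MaxTotalImageDimension: 0 },
    ImageParameterArray: [
      { Name: "IP-1", NormalizerParameterName: "NP-1", BaseImageParameterName: "" }
    ],
    NormalizerParameterArray: [
      { Name: "NP-1", ContentType: "CT_DOCUMENT",
        ColourMode: modes[colorMode] ?? "ICM_BINARY" }
    ]
  };
  return JSON.stringify(settings);
}
```

The normalize function could then call `await DDN.initRuntimeSettingsFromString(buildRuntimeSettings(value));` in place of the three branches.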

Source Code

We’ve now completed the demo. Get the source code and give it a try: https://github.com/tony-xlh/react-native-document-scanner