Figure 3-8: The user can click on any of the related terms, pull a series of images from our database, and add any image from that series to the canvas by drawing a circle in the location and size they would like the image to be.
As seen in Figure 3-2, Paper Dreams generates a Canvas Generated Image of what is currently on the canvas to pass to the Flask server for processing. However, the Canvas Generated Image needs to be exactly 512 by 512 pixels or 256 by 256 pixels for the trained machine learning models, whereas the canvas itself is usually around 2000 pixels in size. Because the user can draw at any size, the image from the canvas needs to be scaled and cropped appropriately to fit into the 512 by 512 pixel image. In addition, because the user does not necessarily draw in the center of the canvas, the image needs to be re-centered in the Canvas Generated Image as well.
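The scaling and re-centering step can be illustrated with a small Python sketch. It is not the Paper Dreams implementation: the function names `fit_transform` and `apply_transform`, the stroke representation (lists of `(x, y)` points), and the `margin` parameter are assumptions made for this example. It computes the bounding box of the drawing, picks one uniform scale so the longer side fits the target square, and offsets the result so the drawing is centered.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def fit_transform(strokes: List[Stroke], target: int = 512,
                  margin: int = 0) -> Tuple[float, float, float]:
    """Return (scale, dx, dy) mapping the drawing into a centered target square.

    The aspect ratio is preserved: a single scale factor is used for x and y.
    `margin` (hypothetical) leaves empty space around the drawing.
    """
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    if not xs:
        raise ValueError("cannot fit an empty drawing")

    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    longest_side = max(max_x - min_x, max_y - min_y)

    # Scale so the longer side of the bounding box fills the usable area.
    usable = target - 2 * margin
    scale = usable / longest_side if longest_side > 0 else 1.0

    # Offset so the bounding-box center lands on the target center.
    dx = target / 2 - scale * (min_x + max_x) / 2
    dy = target / 2 - scale * (min_y + max_y) / 2
    return scale, dx, dy


def apply_transform(strokes: List[Stroke], scale: float,
                    dx: float, dy: float) -> List[Stroke]:
    """Apply the scale and offset to every point of every stroke."""
    return [[(x * scale + dx, y * scale + dy) for x, y in stroke]
            for stroke in strokes]


# Example: a small cross drawn off-center on a large canvas.
strokes = [[(1000.0, 1200.0), (1400.0, 1200.0)],
           [(1200.0, 1100.0), (1200.0, 1300.0)]]
scale, dx, dy = fit_transform(strokes, target=512, margin=16)
fitted = apply_transform(strokes, scale, dx, dy)
```

Working on stroke coordinates rather than on pixels keeps this step independent of how the image is later rendered.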
Paper Dreams originally grabbed the image from the canvas directly by calculating the bounds based on the size of the image, and then cropping the appropriate part of the canvas. However, there was an unexpected issue with this approach: the resulting image did not resemble the images on which the sketch recognition was trained. The Paper Dreams canvas returned images with different line widths (see Figure 3-9) because the line widths were also scaled along with the image itself; however, the sketch recognition training set always had constant line widths (Figure 3-11).
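The effect can be made concrete with a little arithmetic. When a square crop of side s pixels is resized to 512 pixels, every stroke of width w becomes w × 512 / s pixels wide. The sketch below uses illustrative numbers only; the 4-pixel pen width and the crop sizes are assumptions, not measurements from Paper Dreams.

```python
def scaled_line_width(stroke_width_px: float, crop_side_px: float,
                      target_side_px: float = 512) -> float:
    """Stroke width after a square crop is resized to the target side length."""
    return stroke_width_px * target_side_px / crop_side_px


# The same (hypothetical) 4 px pen, used for a small and a large drawing.
small_drawing = scaled_line_width(4, 200)    # crop is enlarged: 10.24 px
large_drawing = scaled_line_width(4, 1600)   # crop is shrunk:    1.28 px
```

An eightfold difference in drawing size therefore produces an eightfold difference in line width in the final image.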
Figure 3-9: Canvas Generated Image, using the raw image from the canvas. Because of the especially large difference in scale, the final images vary in line thickness and pixelation.
I resolved this issue with an SVG-inspired approach: I saved the paths as they were drawn and redrew them as paths onto a fixed 512 by 512 pixel canvas, which allowed me to set a custom line width on that canvas. After analyzing the training dataset for the sketch recognition, I calculated the correct line widths and produced the Canvas Generated Images in Figure 3-10. A comparison of the outputs from both methods to the images from the sketch recognition training set can be found in Figure 3-11.
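A minimal sketch of the path-redraw idea follows, written in Python and emitting SVG text rather than drawing on a browser canvas. The function name `strokes_to_svg`, its parameters, and the default `line_width` of 3.0 are placeholders for illustration; the line width actually derived from the training set is not stated here. The key point is that the transform is applied to the stroke coordinates only, while the stroke width is set independently of the scale.

```python
from typing import List, Tuple

Stroke = List[Tuple[float, float]]


def strokes_to_svg(strokes: List[Stroke], scale: float = 1.0,
                   dx: float = 0.0, dy: float = 0.0,
                   size: int = 512, line_width: float = 3.0) -> str:
    """Redraw saved strokes on a size x size SVG with a constant stroke width.

    scale, dx, dy move the coordinates; line_width is never multiplied by
    scale, so it stays the same however large the original drawing was.
    """
    polylines = []
    for stroke in strokes:
        points = " ".join(
            f"{x * scale + dx:.2f},{y * scale + dy:.2f}" for x, y in stroke
        )
        polylines.append(
            f'<polyline points="{points}" fill="none" stroke="black" '
            f'stroke-width="{line_width}" stroke-linecap="round" '
            f'stroke-linejoin="round"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" '
        f'height="{size}" viewBox="0 0 {size} {size}">'
        + "".join(polylines)
        + "</svg>"
    )


# Two drawings at very different scales end up with the same line width.
tiny = strokes_to_svg([[(10.0, 10.0), (20.0, 20.0)]], scale=20.0)
huge = strokes_to_svg([[(100.0, 100.0), (1900.0, 1900.0)]], scale=0.25)
```

By contrast, resizing a cropped raster image scales the pixels of the strokes together with everything else.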
Figure 3-10: Canvas Generated Image, using the path save method for constant line width. Despite the large difference in scale, the final image has constant line width.
Figure 3-11: The Canvas Generated Image pulled from paths (bottom right) resembles the Sketchy Training set (top) more closely than the images pulled from the canvas (bottom left).
Chapter 4
Integration with Machine Learning
Sketch recognition has been a critical component of Paper Dreams from the conception of this project. By understanding what the user is drawing, the system gathers the contextual information that affects the output of the other integrated machine learning models (as shown in Figure 4-1). Paper Dreams incorporates three routes of intelligent modeling: sketch recognition, style transfer based on the recognized object (adaptive texturization), and generating associated classes via natural language processing (NLP) similarity on the recognized object. Integrating these models depends on using information from one model (such as a sketch recognition label) in another (such as texturization) to get an accurate output from the latter model.
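The dependency between the three models can be sketched as a small pipeline in which the recognized label is computed once and then passed to the other two models. Everything named here is hypothetical: `CanvasResult`, `run_pipeline`, and the three callables stand in for the real models, whose interfaces are not described in this chapter.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CanvasResult:
    label: str                      # output of sketch recognition
    textured_image: bytes           # output of adaptive texturization
    associated_classes: List[str]   # output of NLP similarity


def run_pipeline(image: bytes,
                 recognize: Callable[[bytes], str],
                 texturize: Callable[[bytes, str], bytes],
                 associate: Callable[[str], List[str]]) -> CanvasResult:
    """Run recognition first, then feed its label to the other two models."""
    label = recognize(image)
    return CanvasResult(
        label=label,
        textured_image=texturize(image, label),   # texture depends on label
        associated_classes=associate(label),      # related terms depend on label
    )


# Stub models, for illustration only.
result = run_pipeline(
    b"canvas-image-bytes",
    recognize=lambda img: "horse",
    texturize=lambda img, label: img + b"|" + label.encode(),
    associate=lambda label: ["saddle", "barn"] if label == "horse" else [],
)
```

Because both downstream calls take the label as input, an incorrect label propagates to the texture and to the associated classes.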
4.1 Sketch Recognition
Paper Dreams currently performs sketch recognition on a completed user drawing. Our present recognition architecture is based on the deep learning network Sketch-a-Net, which claims one of the highest accuracy rates on human sketches [22]. The Paper Dreams team trained the eight-layer convolutional neural net (CNN) from Sketch-a-Net on the 125 classes in the Sketchy Dataset and achieved a 75% accuracy rate across those 125 classes. However, while we can use our architecture for recognizing incomplete or “partial” sketches, the resulting labels are often incorrect until enough defining features are drawn.
Figure 4-1: From an image of the current canvas, the system can generate the appropriate label, texture, and associated classes. Note that the sketch recognition affects the output of both other models.
Paper Dreams currently has no way of detecting when the user has drawn enough features for a positive sketch identification. The system temporarily resolves this issue by having the user click a button to call the recognition after they finish their sketch. The Paper Dreams team is working on incorporating a more sophisticated partial recognition system based on DeepSketch 2 [23] that will allow us to predict what a user is drawing at various stages of completion, which would allow for a full real-time implementation of sketch recognition. The partial sketch recognition uses information from the order and placement of the most recently drawn strokes to make its predictions; for example, if the head of a horse is commonly the first body part drawn, then the partial sketch recognition system will assign head-like strokes a higher probability of being a horse.
To generate a training set for this partial sketch recognition architecture, I parsed the SVG images from the Sketchy and TU Berlin databases into a series of “progressive” partial sketches that represent the drawing at various stages of completion (20%, 40%, 60%, 80%, 100%). Because SVGs are made up of path elements in the order they were drawn, I split the paths into five approximately equal groups and drew the groups cumulatively, so that each successive image adds the next group of paths. The final partial sketch dataset contains approximately 400,000 images from 186 classes; examples can be found in Figure 4-2.
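The splitting step can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under stated assumptions rather than the script used to build the dataset: the function name `progressive_svgs` is hypothetical, only `path` elements are treated as strokes, and the rounding rule (round up, so that the first stage always contains at least one path) is a choice made for this example.

```python
import copy
import xml.etree.ElementTree as ET
from typing import List

SVG_NS = "http://www.w3.org/2000/svg"
PATH_TAGS = ("path", f"{{{SVG_NS}}}path")  # with or without the SVG namespace


def progressive_svgs(svg_text: str, stages: int = 5) -> List[str]:
    """Return `stages` SVG strings, each keeping a growing prefix of the paths.

    Stage k keeps the first ceil(total * k / stages) paths in document order,
    which is assumed to be the order in which they were drawn.
    """
    ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(svg_text)
    total = sum(1 for el in root.iter() if el.tag in PATH_TAGS)

    results = []
    for stage in range(1, stages + 1):
        keep = (total * stage + stages - 1) // stages  # integer ceiling
        partial = copy.deepcopy(root)
        parent_of = {child: parent
                     for parent in partial.iter() for child in parent}
        paths = [el for el in partial.iter() if el.tag in PATH_TAGS]
        for el in paths[keep:]:          # drop everything drawn later
            parent_of[el].remove(el)
        results.append(ET.tostring(partial, encoding="unicode"))
    return results


# Example: ten strokes produce images with 2, 4, 6, 8, and 10 paths.
example = (f'<svg xmlns="{SVG_NS}">'
           + "".join(f'<path d="M0 {i} L10 {i}"/>' for i in range(10))
           + "</svg>")
stages = progressive_svgs(example)
```

Each returned string is a complete SVG document, so the stages can be rasterized independently to produce the training images.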
Figure 4-2: A series of five generated progressive sketches, given an original SVG image, for the partial recognition.