5.1 Introduction
5.2.1 Problem Analysis: Frame Rate Requirements
Temporal super resolution (TSR) is done to obtain a certain frame rate, as required by the receiver of the output sequence. In most cases the need is for a higher frame rate. We will focus on the frame rate requirements of human viewers, as our TSR algorithm is aimed at application in video processors in broadcast or home entertainment systems, where pleasing human viewers is the final goal. The properties of the human visual system (HVS) guide what minimum frame rates should be used to keep the viewing experience pleasing. The two main requirements put forth are listed below in order of importance.
• The phi-effect should be obtained, with natural motion portrayal.
• Flickering should be avoided when displaying image sequences.
Before we look at the two requirements, we would like to stress that determining the exact minimum required frame rate of image sequences is a difficult, multiparameter problem. The applied display technology (film projectors in cinemas, LCD TV sets, etc.), screen size, screen brightness, contrast and color reproduction, viewing distance and angle, lighting conditions in the viewing room, image sequence content, motion in the scene depicted, and aperture time of cameras are but the main factors that determine the minimum frame rate required.
There are, however, some general (standardized) agreements on which frame rates to use in different media to meet the two requirements posed above. Since sometime around the 1920s almost all film recordings have been done at 24 fps [7]. The American and Asian TV standard NTSC requires 60 interlaced fields per second (also abbreviated fps), while the European PAL requires 50 fields per second. There was also a growing consensus up through the 1990s that 50 fps was no longer enough to avoid flickering, as PAL CRT TV sets grew larger and brighter. This gave birth to the 100 Hz technology, an example of early TSR (see for instance [32]).
The phi-effect was already discussed in Chapter 2 of this thesis, but to recap: it is the effect obtained when a set of still images is recorded and shown in such rapid succession that any motion in the depicted scene appears real and natural. To create the illusion of motion pictures, each frame cannot be exposed to the eye for too long, as the human visual system will interpolate the simplest (linear) motion between the frames, erasing any complexity in it [73]. Thus the frames need to be recorded at a rather high rate to obtain the phi-effect, at least when the motion is complex.
Flickering occurs when an image is not updated often enough, that is, when the update frequency (frame refresh rate) is so low that the eye senses flicker. Thus in cinemas a rotating shutter briefly blanks out each frame once or twice to get 48 or 72 Hz refresh rates. Since the eye is very sensitive to changes in the light it is exposed to (especially in the periphery of the retina, away from the fovea), and since cinema screens and many larger television screens expose a large part of the eye, flicker can be sensed at frame rates higher than what is traditionally required to obtain the phi-effect. Although one might not consciously sense any flicker, it is still sensed subconsciously, tiring the visual system.1 There seems to be a consensus that refresh rates somewhere around 70 to 100 Hz will suffice to stop most flickering on most displays, but it ultimately depends on the tracking done by the eye of the human observer.2
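The shutter arithmetic above can be written out as a minimal sketch. It assumes that the blanking during the frame change also counts as a dark interval, so that one or two extra blankings per frame give two or three flashes per frame. The function name and parameters are hypothetical and only serve as illustration.

```python
def projector_refresh_rate(frame_rate_fps, extra_blankings_per_frame):
    """Refresh rate perceived when a shutter interrupts each film frame.

    Assumption: the frame change itself is also blanked, so each frame is
    flashed (extra_blankings_per_frame + 1) times.
    """
    if frame_rate_fps <= 0 or extra_blankings_per_frame < 0:
        raise ValueError("frame rate must be positive, blankings non-negative")
    flashes_per_frame = extra_blankings_per_frame + 1
    return frame_rate_fps * flashes_per_frame


FILM_FPS = 24  # film recording rate quoted in the text

for blankings in (1, 2):
    rate = projector_refresh_rate(FILM_FPS, blankings)
    print(f"{blankings} extra blanking(s) per frame: {rate} Hz")
```

Note that the number of distinct images per second stays at 24 in both cases; only the rate at which light reaches the eye is raised.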
1 When one looks at something next to a CRT screen without really focussing on the screen, the screen is projected onto the more flicker-sensitive parts of the retina. If the refresh rate is set to 60 Hz or less, one will then clearly see the flicker that is normally only sensed subconsciously, as the HVS is busy with what is in focus at the fovea.
2 While film projectors need to change frames and CRTs have the problem that the light emission of the phosphor used on the screen drops over time, LCD monitors are not subject to forced blackouts between frames: they are hold-type image displays, not impulse-type image displays (see [20]). Plasma displays also have some memory in the plasma gas that partially prevents flicker problems. Even with the flicker problem (almost) eliminated, LCDs and plasma displays are still subject to the need to create the phi-effect when displaying video, and thus need higher frame rates when displaying motion.
As our discussion of the two requirements has shown, the major reason why we do temporal super resolution is to obtain the phi-effect. Flicker can be handled by simple frame repetition, but repetition adds no new temporal samples of the motion. Since the eye is able to track motion to a certain degree, a sampling frequency that is too low makes the tracked motion appear to flicker, and the HVS cannot infer any natural motion between two samples, so the phi-effect is lost and the motion seems unnatural. Thus our two requirements are closely related, with the difficult task being to prevent flicker in tracked motion in order to reestablish the phi-effect.
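To illustrate why frame repetition addresses flicker but not motion portrayal, the following minimal sketch repeats each frame of a toy sequence. Each "frame" is reduced to the horizontal position of a single moving object; the helper names and the numbers are hypothetical, and this is not the TSR algorithm developed in this chapter.

```python
def repeat_frames(frames, factor):
    """Raise the refresh rate by showing every frame `factor` times."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [frame for frame in frames for _ in range(factor)]


def largest_step(positions):
    """Largest jump in object position between two consecutive frames."""
    return max(abs(b - a) for a, b in zip(positions, positions[1:]))


# Toy input: an object that moves 10 pixels between recorded frames.
positions = [0, 10, 20, 30]
doubled = repeat_frames(positions, 2)

print(len(positions), "->", len(doubled), "frames shown")
print("largest step before:", largest_step(positions))
print("largest step after: ", largest_step(doubled))
```

The repeated sequence is shown at twice the refresh rate, yet the object still jumps the same distance whenever its position changes, which is why repetition alone cannot restore the phi-effect.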
The problem of unnatural motion is most prominent when the boundary of a moving object (or some internal edge of the object) is of high contrast (large gradient). If the recorded frame rate is too low, the phi-effect is not obtained and the viewer is left with a sensation of jerky, unnatural motion. Along with the contrast of the edge, the velocity of the motion can break the phi-effect. The faster an object (or the camera) moves, the more jerky the apparent motion becomes, until the eye is no longer able to track the motion.
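The dependence on velocity can be made concrete with a small worked example: for a fixed frame rate, the distance an edge jumps between consecutive frames grows linearly with its speed. The sketch below assumes constant motion measured in pixels per second; the speed of 600 pixels per second and the function name are illustrative assumptions, not values from this thesis.

```python
def displacement_per_frame(speed_px_per_s, frame_rate_fps):
    """Distance (in pixels) an object moves between two consecutive frames."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return speed_px_per_s / frame_rate_fps


SPEED = 600  # hypothetical edge speed in pixels per second

for fps in (24, 50, 100):
    jump = displacement_per_frame(SPEED, fps)
    print(f"{fps:3d} fps: edge jumps {jump:.1f} px per frame")
```

Raising the frame rate from 24 to 100 fps shrinks the per-frame jump of the same edge by roughly a factor of four, which is the kind of reduction a higher output frame rate is meant to provide.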
Decreasing the apparent jerkiness of fast-moving, high-contrast edges will be our focus when developing and testing our motion compensated temporal super resolution algorithm.
An effect that might cover up the jerky motion that appears at too low frame rates is motion blur. Motion blur occurs (in regions of motion) when the camera used has a long aperture time, so that the input at each point or pixel of the frame is integrated over a relatively long time interval.
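The integration over the aperture time can be sketched numerically. The example below assumes a one-dimensional image row containing an ideal step edge that moves at constant speed, and approximates the temporal integral with a midpoint sum. All names and parameter values are hypothetical and chosen only for illustration.

```python
def step_edge(x, edge_position):
    """Ideal high-contrast edge: 1.0 left of the edge, 0.0 at or right of it."""
    return 1.0 if x < edge_position else 0.0


def exposed_row(width, start_position, speed_px_per_s, aperture_s, samples=100):
    """Average a moving step edge over the aperture time at every pixel.

    The temporal integral is approximated by a midpoint sum with `samples`
    evenly spaced time instants inside the aperture interval.
    """
    row = []
    for x in range(width):
        total = 0.0
        for k in range(samples):
            t = aperture_s * (k + 0.5) / samples
            total += step_edge(x, start_position + speed_px_per_s * t)
        row.append(total / samples)
    return row


# Same edge and speed, recorded with a short and a long aperture time.
sharp = exposed_row(width=12, start_position=4, speed_px_per_s=200, aperture_s=0.0001)
blurred = exposed_row(width=12, start_position=4, speed_px_per_s=200, aperture_s=0.02)

print("short aperture:", [round(v, 2) for v in sharp])
print("long aperture: ", [round(v, 2) for v in blurred])
```

With the short aperture time the edge stays a hard step, while the long aperture time turns it into a ramp over the pixels the edge crossed during the exposure. This lowers the gradient of the edge, which is why motion blur can mask some of the jerkiness.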