NTSC, PAL & Interlace Explained
The Motion Picture Camera & Cinema
Motion picture cameras are based on photographic film, just like your everyday hand-held photographic camera. Hollywood movies use 35mm film, professional cameramen often use 16mm, and the home enthusiast will usually be content with 8mm. To record a movie, motion picture film is spun around a big reel inside a camera and exposed 24 times a second. As a result it captures 24 photographs, or what we call 24 "frames" per second (fps). Each frame is one complete photograph; it is not digitally stored or compressed - you could almost literally cut each one out and stick it in your family photo album if you wanted! Once the movie is made, the film is developed, placed onto a projector, and projected onto the cinema screen.
Resolution
In terms of resolution it's not really possible to compare 35mm film to a VHS or VGA resolution because, like any photographic film, its resolution is based on a myriad of tiny light-sensitive crystals embedded in the film. When these are struck by light they change colour to match the light that has hit them, producing a photo. But a 35mm frame, based on average crystal size, would be about 5000 x 5000 pixels. This is also the resolution Photoshop artists such as Craig Mullins use to create movie backdrops for the cinema. Nevertheless, the human eye can barely see the equivalent of 3000 x 3000 pixels over such a small area. So when a 35mm movie is scanned into a computer to capture its full resolution for digital editing, it is scanned in at 4096 horizontal pixels, also known as 4K.
Television
Television, on the other hand, is a whole other ball of wax! As you probably know, a TV screen is basically an empty glass box (or tube) with all the air sucked out of it. The inside front of this glass box is covered with a mesh of red, green and blue phosphor dots. At the back of the tube are three devices (called electron guns) that shoot three beams of electrons at these phosphor dots. When the electrons hit the dots they glow and a colour picture is produced. Increase the beam strength and you brighten the amount of red, green or blue light produced at any part of the screen. This, in effect, allows the colours to mix into just about any colour and brightness imaginable. You might compare this to mixing coloured paints together to form new colours. Whatever way you look at it, this produces a colour picture that looks almost like real life.
Interlace
Next is the important point! To produce a picture, these electron beams are steered by electromagnets to scan from side to side across the TV screen (as illustrated in the picture below). The beams fly across the screen in the same motion our eyes use when we are reading a book: they start from the left, finish one line and then shoot back to start the next line. When TVs were invented in the 1920s the type of phosphor used to produce the picture did not respond very fast. This meant it was impossible to get a picture in one shot; instead we would get a flickery strobing effect moving down the screen! To solve this it was decided that instead of putting the lines on the screen one at a time (i.e. lines: 1, 2, 3, 4, 5) they would be put on every other line in one pass (i.e. 1, 3, 5, 7, 9) and then in between the previous lines on the second pass (i.e. lines: 2, 4, 6, 8 etc.). This allowed a whole picture to be produced in two very fast scans and gave the slow phosphor dots enough time to recover, preventing any strobing effects from appearing - success! This process is called interlacing!
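The two-pass scan order described above is easy to sketch; here is a minimal Python illustration (the function name `interlaced_scan_order` is made up for this example):

```python
def interlaced_scan_order(total_lines):
    """Return the order in which an interlaced display draws its lines:
    all odd-numbered lines in the first pass, then all even-numbered
    lines in the second pass (1-based, as in the text above)."""
    first_pass = list(range(1, total_lines + 1, 2))   # lines 1, 3, 5, ...
    second_pass = list(range(2, total_lines + 1, 2))  # lines 2, 4, 6, ...
    return first_pass + second_pass

# A tiny 9-line screen: pass one draws 1, 3, 5, 7, 9; pass two fills 2, 4, 6, 8.
print(interlaced_scan_order(9))  # [1, 3, 5, 7, 9, 2, 4, 6, 8]
```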
Resolution
An analog TV's resolution refers to the number of horizontal lines displayed on the screen. These are broken up into active and non-active areas. The non-active or blanked area (A) is not used for the actual television picture and is basically always 'blanked'. The signal time that would have carried picture here is often used for closed captioning, sync info or other information such as VITC. But obviously the bit we are interested in is the active part (B), which is where the actual picture appears.
NTSC
The TV industry is dominated by two main standards for TV design: PAL
and NTSC. NTSC is one of my pet hates basically because of it's rather
low quality and use of weird framerates. NTSC stands for the National
Television
Systems Committee, it is the colour video standard used in North
America, Canada, Mexico and Japan. Some engineers have said it should stand
for Never Twice Same Color because no two NTSC
pictures look alike :). Due to the electric system used in the US it was
decided to scan the lines across the NTSC TV screen at about 60Hz (or 60
half frames per second) which produced 29.9700299700299700... frames every
second. NTSC resolution is about one sixth less than that of PAL - about
89 lines less. This may not seem so bad, but divide a sheet of paper into
six even parts and chop one off of the bottom and you will have a lot of
detail lost. NTSC uses 525 scan lines, number 1 through 525, of which only
485 make up the active picture.
Scan lines 1-20 are the vertical blanking interval for field 1. (Note: in NTSC's 525-line system, "line 1" is defined as the first line in the field blanking period of field 1; NTSC line counting is one-based, not zero-based.)
Scan lines 21-263 are visible as field 1.
Scan lines 264-282 are the vertical blanking interval for field 2.
Scan lines 283-525 are visible as field 2.
Scan line 263, the last scan line of field 1, is a half-line which only appears on the left side of the screen, and on that side is the bottom-most scan line, below scan line 525.
Scan line 283, the first scan line of field 2, is a half-line which only appears on the right side of the screen, above scan line 21.
So depending on whether one counts scan lines 263 and 283 integrally (as I do), or only as 0.5 of a scan line each (as some people do), you have either 486 or 485 scan lines.
Due to variances in CRTs, some visible scan lines may be cropped on less expensive CRT monitors.
(Added by John “Eljay” Love-Jensen)
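A few of the NTSC numbers above can be checked with a short Python sketch (counting the half-lines 263 and 283 integrally, as the list does):

```python
from fractions import Fraction

# NTSC frame rate: the field rate was lowered by a factor of 1.001
# for colour, giving exactly 60000/1001 fields (30000/1001 frames) per second.
FRAME_RATE = Fraction(30000, 1001)
print(float(FRAME_RATE))                 # 29.97002997002997...

# Scan-line bookkeeping from the list above:
field1_visible = len(range(21, 264))     # lines 21-263 -> 243
field2_visible = len(range(283, 526))    # lines 283-525 -> 243
print(field1_visible + field2_visible)   # 486 visible scan lines
```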
PAL
PAL stands for Phase Alternating Line; it is the TV standard used in Europe, Hong Kong and the Middle East. It was a new standard based on the old NTSC system but designed to correct the NTSC colour problems produced by phase errors in the transmission path. PAL resolution is 625 horizontal lines, but only 576 of these are used for the picture. PAL is higher quality than NTSC: it keeps a sharper picture and remains closer to the original format produced by motion picture cameras. Due to the European electric standards it was decided to interlace PAL lines every other line at 50Hz, producing 25 whole frames every second.
TELECINE
This is the bit you've all been dying to read. Unfortunately I have not written this with a bunch of amazing solutions in mind. The idea is more to help you understand what is going on with your video so you can decide how best to process it.
Just so you don't get confused, you should be clear on the difference between a frame and a field. A 'field' is basically every other scan line of a picture. Two fields stuck together make a single frame on a TV set! In the picture below only one field is displayed on the left. It's hard to see because only every other line is displayed. The picture on the right is a whole frame. It is produced when we stick both fields together.
[Picture: one field (a half frame) on the left; two fields woven together (a whole frame) on the right.]
Single fields that start from line 1 of the TV screen are called 'odd' because they cover the odd-numbered lines (i.e. 1, 3, 5 etc.). Fields that start from the second line to fill the gaps of the first are called 'even' because they cover the even-numbered lines (i.e. 2, 4, 6 etc.). Fields that start from line 1 are more often also called "Top" fields because they start from the first "top" line on the screen, whereas single fields that start from the second line down are called "Bottom" fields. Okay, now everything you read should make perfect sense =)
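As a quick sketch of the frame/field relationship, here is how splitting and weaving might look in Python, treating a frame as a simple list of scan lines (the helper names `split_fields` and `weave` are made up for this example):

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two fields.
    Index 0 is screen line 1, so the top field takes screen lines
    1, 3, 5... and the bottom field takes 2, 4, 6..."""
    top = frame[0::2]     # screen lines 1, 3, 5, ...
    bottom = frame[1::2]  # screen lines 2, 4, 6, ...
    return top, bottom

def weave(top, bottom):
    """Interleave two fields back into a whole frame."""
    frame = []
    for t, b in zip(top, bottom):
        frame.extend([t, b])
    return frame

frame = ["line1", "line2", "line3", "line4"]
top, bottom = split_fields(frame)
print(top, bottom)                  # ['line1', 'line3'] ['line2', 'line4']
print(weave(top, bottom) == frame)  # True
```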
TELECINE
As I have already mentioned, a motion picture camera captures its images at 24 frames every second, and each frame is a full image. An NTSC television, however, must play 30 frames per second, and these frames must be interlaced into two fields, top and bottom! So basically we must play 60 half frames (or fields) every second. The only way we are going to be able to play a 24 fps motion picture on NTSC television is to change it from 24 fps to 30 fps and interlace these frames into fields, making 60 half frames per second. This transformation is done with a machine called a Telecine. A Telecine machine does something called pulldown, which, in its simplest explanation, "pulls down" an extra frame every fourth frame to make five whole frames out of every four!
3:2 Pulldown NTSC
3:2 pulldown is a name that confuses people, basically because the term "pulldown" is rather ambiguous - in other words, it's not really pulling anything down! The process sounds complex but it's really quite straightforward, and I have designed a picture to illustrate it. The top row in the picture below represents four frames from a motion picture camera. These are full frames, not yet interlaced; they are represented as A, B, C, D.
Now look at the second line in our picture below. The Telecine machine takes the first whole frame A and splits it into three fields (stop reading and take a look now). For the first field it uses the top field (T), which means it takes lines 1, 3, 5 etc. from the original digitized picture. The next field taken from A is the bottom field, so it takes lines 2, 4, 6 and so on. The third field we see is labeled (Tr), which is just a copy of the first field again (so I labeled it T(r) to mean 'top repeated').
Now the Telecine machine moves on to the next frame B. This time it just takes the bottom and top fields. Then we move on to the third frame C; it is split into three fields: bottom (B), top (T) and a repeat of the bottom one again (Br). Finally, the fourth frame D is split into the top and bottom fields. That's it, that is all a telecine machine does!
In short, this results in a field pattern of 3 fields, 2 fields, 3 fields, 2 fields! Or, if it's easier to understand, our picture above shows it as: 3 yellow, 2 green, 3 blue, 2 red.
So that is why it is called 3:2 pulldown: it goes in a sequence of 3, 2, 3, 2 and so on, "pulling down" whole frames and splitting them alternately into three fields and two fields. Finally, after the Telecine machine has finished the fourth frame D, it starts the process all over again with the next four pictures of the movie.
In short we end up with:
At Ab At(r) / Bb Bt / Cb Ct Cb(r) / Dt Db
But because it always goes top, bottom, top, bottom, top, bottom etc., we would just say it without indicating top or bottom fields. So instead of the above we would describe it as:
AAA BB CCC DD
Whatever way you look at it, in the end you get 5 whole frames instead of 4. This turns a 24 fps movie into a 30 fps movie!
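The whole 3:2 pulldown pattern described above can be sketched in a few lines of Python (the `pulldown_32` helper is hypothetical, purely to illustrate the field sequence):

```python
# One telecine cycle: how each of four film frames is split,
# following the description above.
PATTERN = [("T", "B", "T"),   # frame A: top, bottom, top repeated
           ("B", "T"),        # frame B: bottom, top
           ("B", "T", "B"),   # frame C: bottom, top, bottom repeated
           ("T", "B")]        # frame D: top, bottom

def pulldown_32(frames):
    """Expand film frames (in groups of four) into 3:2 pulldown fields."""
    fields = []
    for i, frame in enumerate(frames):
        for field_type in PATTERN[i % 4]:
            fields.append((frame, field_type))
    return fields

fields = pulldown_32("ABCD")
print("".join(f for f, _ in fields))       # AAABBCCCDD
print("".join(t for _, t in fields))       # TBTBTBTBTB - always alternating
print(len(fields) // 2, "whole frames from", len("ABCD"))  # 5 from 4
```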
Interlacing the picture back together
Let's look at the picture again, at the third line down. Here we can see how these fields are woven back together to produce whole pictures again, as we would see on a TV or computer screen. The top field of frame A is woven together with the bottom field of frame A. Then the repeated top field of frame A is woven together with the bottom field of frame B. The top field of frame B is woven with the bottom field of frame C. The top field of frame C is woven together with the repeated bottom field of frame C. And finally, the top field of frame D is woven together with the bottom field of frame D.
That's quite a mouthful to explain in words, but examine the picture - it should really explain itself. Since each pair of fields is stuck together, instead of describing telecine in the order:
AAA BB CCC DD
we would say:
AA AB BC CC DD
The change is only in how we group the letters, of course, and means nothing more.
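The regrouping is nothing more than pairing consecutive fields, as a tiny Python sketch shows (using just the source-frame letters of the ten pulldown fields):

```python
# Pair the ten pulldown fields (AAABBCCCDD) two at a time to see
# which film frames each interlaced TV frame is built from.
field_sources = "AAABBCCCDD"
tv_frames = [field_sources[i:i + 2] for i in range(0, len(field_sources), 2)]
print(tv_frames)  # ['AA', 'AB', 'BC', 'CC', 'DD']
```

Note that the second and third TV frames ('AB' and 'BC') mix fields from two different film frames - the cause of the combing problems discussed later.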
A Weird Framerate
This is not quite the end of the saga. The old black and white TVs played back at a perfectly round 30 fps. But as usual NTSC found a way to destroy that perfection! With the introduction of color TV it was decided (for technical reasons) that the picture must be played back at 29.970029970... fps (59.94Hz fields), which is basically only 99.9% of its full speed. As a result NTSC movies still have the same number of frames they did when they were telecined, but they are played back at a fractionally slower rate.
2:2 Pulldown PAL
PAL movies also get telecined, but not in the same way an NTSC movie does. A Telecine machine will use what is sometimes called 2:2 Pulldown! This simply turns every frame into two fields so they can be played on a standard PAL television. This makes 25 frames into 50 fields, which when played on a TV set at 50Hz produce 25 whole frames per second. So instead of going 3, 2, 3, 2, 3, 2 it goes 2, 2, 2, 2, 2, 2! This produces the fields:
At Ab / Bt Bb / Ct Cb / Dt Db
Or just:
AA BB CC DD
Again, a PAL movie will contain all the frames from a 24fps film with no additional ones, but it will still play those frames back faster, at 25 fps. In a way of speaking it is just as correct (or wrong) to say that a PAL movie is 24 fps, because no frames have been added to it; they are just played back faster.
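The PAL speed-up is simple arithmetic; a quick sketch:

```python
# A 24 fps film played back at PAL's 25 fps runs a little over 4% fast,
# which is why the audio needs a matching speed-up and pitch correction.
film_fps, pal_fps = 24, 25
speedup = pal_fps / film_fps
print(round((speedup - 1) * 100, 2), "% faster")  # 4.17 % faster

# A 2-hour film loses about five minutes of running time:
minutes = 120
print(round(minutes * film_fps / pal_fps, 1), "minutes on PAL")  # 115.2
```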
INVERSE TELECINE (IVTC)
I think I'm correct in saying that there is no such thing as an Inverse Telecine machine :). But, as the name suggests, inverse telecine is a process that turns a 30 fps movie back into a 24 fps movie. Basically it takes out all those extra fields that were added to the movie to make it 30fps. It's about now that I start spluttering, because this is an awkward subject and I can't find any information on exactly how Inverse Telecine is performed! So instead I will describe what looks like it should be done, based on how the movie was telecined in the first place.
Let's go back to our picture! As you can see from the second row down, to turn the 24fps movie into 30fps we separated the pictures into 10 single fields (or half frames) by adding two fields that shouldn't normally be there. Counting from left to right, all we would need to do to turn our 10 fields back into 8 fields (to turn 30 fps into 24fps) is to delete fields 3 and 8. Remember, we are talking fields here, not frames.
But taking out fields 3 and 8 would produce a movie with a field order of: top, bottom, bottom, top, bottom, top, top, bottom! Since you cannot weave together two bottom fields or two top fields, we would need to swap some of them around. So imagine the order of the fields as:
1, 2 | 3, 4 | 5, 6 | 7, 8
T, B | B, T | B, T | T, B
To get the correct order we must change them to:
1, 2 | 4, 3 | 6, 5 | 7, 8
T, B | T, B | T, B | T, B
Which gives us an order of: 1, 2, 4, 3, 6, 5, 7, 8, which should theoretically fix everything.
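That delete-and-swap procedure can be sketched directly in Python (an illustration of the reasoning above, not how any real IVTC filter is implemented):

```python
def ivtc(fields):
    """Undo 3:2 pulldown on one ten-field group, following the text:
    drop the repeated fields (positions 3 and 8, counting from 1),
    then swap neighbours so fields pair up as top/bottom again."""
    kept = [f for i, f in enumerate(fields, start=1) if i not in (3, 8)]
    # the 1, 2, 4, 3, 6, 5, 7, 8 reordering, as 0-based indices
    order = [0, 1, 3, 2, 5, 4, 6, 7]
    return [kept[i] for i in order]

# The ten pulldown fields of frames A-D, as (frame, field-type) pairs:
fields = [("A", "T"), ("A", "B"), ("A", "T"), ("B", "B"), ("B", "T"),
          ("C", "B"), ("C", "T"), ("C", "B"), ("D", "T"), ("D", "B")]
restored = ivtc(fields)
print("".join(f for f, _ in restored))  # AABBCCDD - the 4 film frames
print("".join(t for _, t in restored))  # TBTBTBTB - clean top/bottom pairs
```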
The Framerate Mystery Unraveled! 23.976 / 24 / 29.970 / 30
If the only framerates we use are 24, 25 and 29.970, then why do people speak of 23.976? This is to do with how the movie has been created. A 25 fps movie still has the same number of frames as a 24 fps movie, because none have been added; a PAL television just plays them back at 25 fps. This makes the PAL movie play back over a slightly shorter length of time, and means the audio would end up slightly out of sync. To compensate for this, when a movie is telecined a 'pitch correction' is applied which speeds up the audio to match the playback speed; in the case of PAL this means a pitch correction of about 4%.
A 3:2 pulldown telecined movie has the frames of a 30 fps movie. But an NTSC television will play them back slightly slower, at 29.970 fps (59.94Hz). The number of actual frames hasn't changed; none have been added or taken out! Here is where the 23.976 part comes in. If we inverse telecine a true 30 fps movie we end up with 24 fps. But if we inverse telecine a 29.970 fps movie, because it runs at a slightly slower speed, instead of getting 24 fps as we should, we end up with the slightly slower rate of 23.976 fps.
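The 23.976 figure falls straight out of the arithmetic:

```python
from fractions import Fraction

ntsc = Fraction(30000, 1001)               # 29.970... fps
film_after_ivtc = ntsc * Fraction(24, 30)  # undo 3:2: every 5 frames back to 4
print(film_after_ivtc)                     # 24000/1001
print(round(float(film_after_ivtc), 3))    # 23.976
```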
PROBLEMS WITH INTERLACED MOVIES
Interlaced movies look fine on a standard TV but appear terrible on a PC monitor!? Let's take a look at our example one last time to see why. Look at the last row, where it shows how the top field of frame B is interlaced together with the bottom field of frame C.
We are getting the top and bottom fields from two completely different frames!! Imagine taking half of one picture and half of the next and trying to put them together into a single picture - it's impossible! On a PC this produces what we see below. Here we have Star Trek's William Riker walking across the room from left to right. Notice that the top field from the previous frame shows him a little to the left, and the bottom field of the next frame shows him a little to the right. This is what produces this combing effect, and no amount of shifting the lines to the left or the right will fix it!
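One way software can spot this combing is to measure how strongly each scan line disagrees with its vertical neighbours; here is a crude Python sketch (the metric is my own assumption for illustration, not any particular filter's algorithm):

```python
def combing_score(frame):
    """A crude combing metric: sum how much each scan line differs
    from the average of the lines above and below it. Two fields
    captured at different moments push this score up sharply."""
    score = 0
    for y in range(1, len(frame) - 1):
        for x in range(len(frame[y])):
            neighbour_avg = (frame[y - 1][x] + frame[y + 1][x]) / 2
            score += abs(frame[y][x] - neighbour_avg)
    return score

# A frame woven from two matching fields vs. one from mismatched fields:
clean = [[10, 10, 10, 10]] * 4
combed = [[10, 10, 10, 10], [90, 90, 90, 90],
          [10, 10, 10, 10], [90, 90, 90, 90]]  # alternating lines disagree
print(combing_score(clean))                    # 0
print(combing_score(clean) < combing_score(combed))  # True
```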
Inverse Telecine Troubles
Look at our illustration one more time. A 3:2 pulldown movie can also be encoded as 2:3, which produces exactly the same result but is done backwards - instead of getting 3, 2, 3, 2 we get 2, 3, 2, 3! But this doesn't really matter, because since a 3:2 pulldown movie can be cut and edited after it is made, the very first frame doesn't always start with the top field of A anyway! It could, for example, start with the next field across - the bottom field of A. In fact, it could start with absolutely any of the 10 fields in the sequence!
Hence, as far as I can see, there must be at least 10 ways to perform inverse telecine: five assuming the first field is top and five assuming the first field is bottom. Let me know if you know exactly how IVTC works and I'll update this article to explain it better.
OTHER ISSUES
Some of the specials and extra features of a DVD seem to have been recorded from a telecined 29.970 fps source! This means that the interlaced picture was actually edited as an interlaced picture on a computer and then reinterlaced again! There is no way to fix such a problem, because the comb lines are literally a part of the original picture now. For example, I have taken a frame from the trailer of one DVD and separated the fields into two. When I squash all the lines together from one single field I get the following picture:
Of course, I could be completely mistaken about this, but that is what appears to be the case.
Capture Cards
Most of the graphics cards, TV tuners and video capture hardware we use to record video to the PC will not perform any kind of IVTC. Neither do they seem to give a damn what order they whack the TV fields together in. This means, regardless of whether you use PAL or NTSC, that if you want to capture any video footage above 240 pixels high (for NTSC; 288 for PAL) you will get at least some interlace problems! When you capture below 240 pixels the capture card will only use one field, and hence interlace problems are almost impossible. If your capture card can get a larger picture without problems, check the instructions to see how it's doing it! You may find to your horror that it is actually just capturing at 240 and enlarging the picture after it has been captured. This is obviously a serious waste of space!
Deinterlace filters
Since performing inverse telecine (IVTC) to turn a 30 fps movie back into a 24 fps movie is so awkward, a few alternatives have been designed to work on just about any movie. There are only two types that I know of:
Bobbing: To bob basically means to enlarge each field into its own frame by interpolating between the lines, so from one field we produce a full frame. Because the top fields are a line higher than the bottom fields, the image may appear to "bob", but this is usually fixed by nudging the whole frame up or down a pixel. You are only really getting half the resolution with bob, but the interpolation is usually very good quality. If you are stuck for a way to bob your video, my AVISynth guide offers a bob feature - check it out Here.
Blending: Flask Mpeg's deinterlace filter looks for the parts of a picture where the two fields do not match and blends the combing effect together. The lower the threshold, the more the two parts are blended and the less of a combing effect appears. The problem with this method is that the final picture can quite often end up a bit blurry.
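A minimal sketch of bobbing, treating a field as a list of pixel rows and interpolating the missing lines by simple averaging (real bob filters use better interpolation than this):

```python
def bob(field):
    """Bob one field into a full-height frame: keep each field line and
    interpolate the missing line below it as the average of its
    neighbours; the very last line is simply repeated."""
    frame = []
    for i, line in enumerate(field):
        frame.append(line)
        if i + 1 < len(field):
            # interpolate the missing scan line between two field lines
            frame.append([(a + b) // 2 for a, b in zip(line, field[i + 1])])
        else:
            frame.append(line)  # last line has no neighbour below: repeat it
    return frame

top_field = [[0, 0], [100, 100]]  # two scan lines of one field
print(bob(top_field))             # [[0, 0], [50, 50], [100, 100], [100, 100]]
```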
DVD & TELECINE
DVDs offer a strange twist to the whole Telecine and 3:2 pulldown business. Almost all DVDs store the movie as whole pictures at 24 fps - the original format of the film, with no Telecine. At the start of every Mpeg-2 DVD stream there are header codes that tell the player how to play back the DVD. Since the video is stored digitally, the fields or frames can be given to the hardware or software in any order the disc likes; it can split the movie into two fields and perform telecine on the fly. To do this there are three flags that can be applied in the header code: RFF (repeat first field), TFF (top field first) and FPS (frames per second).
For a PAL DVD the FPS flag can be set to 25, and the picture information is sent to the hardware at 25 fps instead of the 24 fps stored on the DVD.
For NTSC DVDs the movie needs to be 29.970 fps, so the FPS flag is set to 29.970. But on its own this looks odd, because the movie is over far too soon. Think of it like playing cards: if you throw 4 cards on the floor every second, the whole pack will be finished in half the time than if you threw 2 cards per second. The solution is to telecine the movie with 3:2 pulldown to increase the number of "cards" we have to start with. To do this, the RFF and TFF flags are set in the header code. By setting the DVD to repeat the first field you make the video display the fields in the 3, 2, 3, 2 pattern. By setting the TFF flag you set the DVD to start from the top field, so the order always goes: top, bottom, top, bottom.
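How such flags could drive playback can be sketched as follows; the per-frame flag values shown are one plausible assignment that reproduces the 3:2 field sequence from earlier, not values read from any actual disc:

```python
def displayed_fields(frame, tff, rff):
    """Fields a player shows for one coded frame: the first field is
    chosen by top-field-first (TFF), and repeat-first-field (RFF)
    makes the player show the first field a third time."""
    first, second = ("T", "B") if tff else ("B", "T")
    fields = [(frame, first), (frame, second)]
    if rff:
        fields.append((frame, first))  # repeat the first field
    return fields

# One plausible (frame, TFF, RFF) pattern for soft 3:2 pulldown:
flags = [("A", True, True), ("B", False, False),
         ("C", False, True), ("D", True, False)]
out = [f for frame, tff, rff in flags
       for f in displayed_fields(frame, tff, rff)]
print("".join(f for f, _ in out))  # AAABBCCCDD
print("".join(t for _, t in out))  # TBTBTBTBTB
```

Four stored progressive frames thus become ten displayed fields without any extra frames being stored on the disc.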
Theoretically, then, it should be possible to patch the header code of a DVD's Mpeg-2 file and make it play back at 24 fps instead of 29.970 fps! In fact some people have made patches to do this, but so far, for another unknown reason, they are very unreliable and the video turns out just as bad!
Progressive and Interlaced together!
I don't think I have mentioned what a progressive image is yet? A progressive image is a whole frame that is not interlaced. Motion picture cameras capture images that are progressive: they are not telecined or split into separate fields. Computer monitors do not need to interlace to show the picture on the screen like a TV does; they put the lines up one at a time in perfect order, i.e. 1, 2, 3, 4, 5, 6, 7 etc.
Many DVDs are encoded as progressive pictures, with interlaced field-encoded macroblocks used only when needed for motion. Flask Mpeg tries to take advantage of this fact: if you set it to 24 fps (or 23.976) it gives you the option to reconstruct progressive images. This does not perform any deinterlacing on the video; it ignores all the flags and just reads the DVD one progressive image at a time.
This is another confusing issue for me. I have no idea how a DVD movie can be both interlaced and progressive, other than by the fact that a progressive movie can be played back as interlaced due to the control flags. If I learn any more about this I will update my articles accordingly.
VHS, VCD & DVD
To finish, perhaps it would be nice to say a few words about the video formats too. It wasn't long after TV that VHS video recorders appeared on the scene, and a while later the Video CDs did. Of course there were other video formats, but VHS (Video Home System) and MPEG (Moving Picture Experts Group) won the battle, at least as far as home video was concerned. This is a little strange really, because Sony's Betamax video was probably the better quality! Anyway, all video formats to date have required one form of compression or another to be able to record the huge quantities of information needed to store full motion video.
VHS
VHS video is stored, just like audio, on a reel of plastic tape impregnated with ground-up iron. This plastic tape is spun in front of an electromagnet that replicates the strength of the TV's electron beams as they would scan across the screen. This causes magnetic 'kinks' in the iron particles of the tape that are almost identical to the original TV signal. Reversing this storage process reproduces the image on the TV screen. The signal is simplified before it reaches the tape, making it take up less space.
As anyone who has ever used video tape knows, it soon loses quality: it appears grainy, loses colour accuracy and starts to produce white glitches and audio waver. A better solution was needed.
MPEG-1
As computer technology advanced, CD-ROM video formats became popular, and the Moving Picture Experts Group designed a compression format that could store over an hour of VHS-quality video on a single CD-ROM. This soon became very popular in the East but never truly caught on anywhere else, due to the fact that recording it was difficult and slow and the quality was not really any better than normal VHS anyway. The big advantage of Mpeg-1 video was that it was almost impossible for it to lose picture quality like a VHS videotape! It could last perhaps over 100 years of use without any noticeable degradation of image quality!
MPEG-2
Since (at the right bitrate) Mpeg-1 was able to produce TV-quality pictures superior to VHS, the Mpeg organization decided to design another version that extended Mpeg-1 with support for interlaced images so it could be used for TV broadcasts. This format was called Mpeg-2. Other features were added to Mpeg-2 to make it compress slightly better and at higher quality, but the main difference was the addition of interlace support.
Mpeg-1 VideoCDs showed that CD-based digital video was not only a viable option, but a very preferable one - if the storage space was enough. When disc designs were upgraded to store 4.38 gigabytes or more of information, it was decided that these new discs would be the new storage media for video. The format was called DVD, meaning Digital Video Disc, although this was later changed to Digital Versatile Disc because it was 'versatile' enough to hold other data besides video.
Resolutions
Resolutions are an important issue for amateur video enthusiasts who want to capture their video at full TV quality. Professional video editors are told to capture at 640 x 480 pixels for highest quality. But a PAL TV's resolution is 576 lines. Then we have the Mpeg group saying that 352 x 288 is the full VHS video resolution! The problem seems to lie in the fact that it's hard to equate a TV resolution with a computer image. The TV picture is built up of lines, but the dot definition is rather "fuzzy" looking. So rather than me rattling on about the pros and cons here, I will end this article by quoting what the Ligos corporation (the creators of the LSX Mpeg-2 encoder) say on this subject:
"The resolution of computer video, however, doesn't generally equate to the video world of televisions, VCRs, and camcorders. These devices have standards for resolution that are generally focused on the horizontal resolution (the number of scan lines from top-to-bottom that make up the picture). Here are some numbers for comparison:
Video Format    Horizontal Resolution
Standard VHS    210 Horizontal Lines
Hi8             400 Horizontal Lines
Laserdisc       425 Horizontal Lines
DV              500 Horizontal Lines
DVD             540 Horizontal Lines
With these numbers in mind, it is important to remember this rule when bringing the worlds of computer and video together: the quality of an image will never be better than the quality of the original source material.
We suggest capturing at a resolution that most closely matches the resolution of the video source. For video sources from VHS, Hi8, or Laserdisc, SIF resolution of 352x240 will give good results. For better sources such as a direct broadcast feed, DV, or DVD video, Half D1 resolution of 352x480 is fine. There are other advantages to following these guidelines. Your files will be smaller, consuming less space on the hard drive or on recordable media like CD-R and DVD-RAM. You'll also be able to encode more quickly."