420, 422, 444, rescaling and colors flame
  • Made from http://www.personal-view.com/talks/discussion/9542/gh4-4k-panasonic-video-camera-official-topic#Item_798

    Intended for flame only: theories of "different magical colors" and such.

  • I am not sure about this: my monitor is 1920x1200, but on YouTube GH4 footage looks much better when watched at 4K than when the 1080p option is selected. Everybody notices that, right? But is it just due to YT's higher bit-rate compression for 4K, or does some downscaling benefit already kick in, even in playback? Or what happens? Sorry if this is a stupid question, I'm a bit tired right now and my brain refuses to figure anything out on its own :)

  • @GlueFactoryBJJ no one is saying you are lying, I am simply voicing my opinion that your math is not correct as far as digital media goes. We have a difference of opinion. I explained the way effects are calculated using placeholders, and I explained why these placeholders allow the NLE to reconstruct algebraic processes exactly. The NLE will not calculate the series to infinity; it will use one of several different systems to deal with that last digit. Now, I could be wrong and you could be right, and my calculator could also be wrong, and my DAW as well. I actually never really wondered about that, I just went as far as checking that if I burn a CD or render an MP3 I get the same "number" each time (which is impossible with dither, for obvious reasons).

    Windows calculator: how to use.

    1. Open calculator. Click view, select "scientific"

    2. Type 2, then the square root button, you will see a long number, but you will see a superscript showing that the computer is recognizing an algebraic process.

    3. Click the x2 button

    4. View the correct answer, which is 2, exactly

    As for rounding, all digital media uses rounding and/or dither to shape the last bit. If you have an alternative, I don't know what that would be, but you are welcome to develop your own system. It would be difficult to design a system that holds an infinite number of placeholders, but perhaps by using a closed logic system with a non-binary base you could do that. However, such a system would give you the same results as the one currently in use.

    There is an alternative to rounding/dither, which is truncation. Truncation means simply that you throw the last digits away. It is certainly an interesting question whether, in a high-bit space, any human could tell if the last bits were dithered or truncated.

    As far as using extra data from downsizing to recreate the extra bits, I would be convinced by a mathematical demonstration of the exact process so if you have one, I would be interested in seeing it.
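The rounding / truncation / dither distinction discussed above can be sketched in a few lines of Python (a toy 12-bit-to-8-bit reduction; the TPDF dither shape is an illustrative assumption, not a claim about what any particular NLE does):

```python
import random

def to_8bit(v12, mode="round"):
    """Reduce a 12-bit code (0..4095) to 8 bits (0..255)."""
    x = v12 / 16                 # dropping 4 bits = dividing by 16
    if mode == "truncate":
        return int(x)            # throw the low digits away
    if mode == "dither":
        # TPDF-style dither: add noise before rounding so the
        # quantization error is decorrelated from the signal
        x += random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)
    return max(0, min(255, round(x)))

print(to_8bit(1016, "truncate"))  # 63
print(to_8bit(1016, "round"))     # 64
print(to_8bit(1016, "dither"))    # varies run to run
```

Note that truncation and rounding can disagree by a full code value, and dither makes the output non-repeatable by design.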

  • @DrDave - And when you don't like the answer someone else gives you, you can always say they are lying and that they are wrong. Frankly, I don't care if you believe me or not, I know what my intent was when I made the statement.

    Also, don't use rounding as a method of making your assertion correct. Rounding doesn't make for a correct (or precise) answer, it just makes for one that is "good enough". If you like, use the calculator in Windows: take the square root of 7, copy it to the clipboard (~20 places), then clear the calculator, paste the number back in and square it. You will get 6.99etc to a large number of places, but NOT 7. As I said, an analogy, extreme, but an analogy.

    When I talk about them not being the same (12-bit RAW vs 8-bit lossy compressed), I'm not talking about rounding errors or differences in dither placement due to random number generators (the quality of the random number generator is a whole other discussion). This is both testable (as has already been done here and on other sites across the internet) and a fundamental truth of information theory. Data can be CREATED, but the original can't be duplicated once it has gone from 12-bit to 8-bit and back to 10-bit, even with the best of algorithms.

    Even @Vitaliy has admitted as much in an earlier post. Why people keep trying to read more into what I post than what I've said, I just do not understand.

    As I have tried to say in three previous (but edited/deleted) posts: y'all win. I'm through with this thread!
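The calculator experiment described in this post is easy to reproduce (a small sketch of the precision-loss analogy: keep only a few decimal places of a square root and squaring no longer returns the original number):

```python
import math

full = math.sqrt(7)        # 2.6457513110645907 at double precision
short = round(full, 3)     # keep only 3 decimal places, as in the analogy
print(short * short)       # ~7.001316, not 7: the shortened root won't square back
```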

  • @GlueFactoryBJJ In math, when you make a mistake, you can call it an "analogy," but it is still a mistake. A rounding calculator will correctly reverse an irrational number, and the same is true for digital audio or video, if the code is written properly. Furthermore, an irrational number, if algebraic, is easily reversible.

    As far as "identical" goes, there are of course different scenarios depending on dither. You can have two images that look absolutely identical, with the same dither, but which are mathematically different, because in some cases the dither, although unobservable, is produced by a random number generator. In this case the math is useless as a tool to test whether the images are the same, or produced "losslessly".

  • Keep in mind also that there's no guarantee at all that the sampling of 4 pixels will be closer to a value in the 10-bit space rather than the 8-bit one.

  • Re-scaling doesn't remove artifacts. That was always "wishful thinking", and it has already been proven in this thread (page 3). The simple procedure, analogous to playing with buckets of balls, is going to be the single least effective method of trying to get a free lunch with this stuff. There's no magic. It's not a problem that can really be solved with simple or clever spatial filtering alone.

  • Vitaliy, I will be very interested if someone can down convert and get a true 10 bit image, but for now it does not exist. I'll leave it to the hardcore coders to make it work. I'll just shoot 10 bit files, it's a hell of a lot easier!!! :-)

  • Yes you can get 10 bit data, but no, it will not look like 10 bit from a camera. End of speculation for me.... I don't guess, I test!

    Cool, happy for you. It is just not a test, as it does not have a properly defined procedure (add here that no one knows exactly how the rescaling is done), and is little more than a guess.

  • Just tried the down convert command line utility mentioned a few pages back..... Quote: Hey guys, I’ve written a really simple command line app for Mac that will resample GH4 footage from 4K 4:2:0 to 2K 4:4:4 using pixel summing. This will give you real 10 bit data in the luminance channel, so it’s not just doing a brute-force bump from 8 bits to 10 bits. There actually is some interesting pixel finagling going on here.

    I did the conversion and brought original mov and dpx files into Resolve for grading. The result? Perfectly preserved 8 bit artefacts. Surprise.

    Yes you can get 10 bit data, but no, it will not look like 10 bit from a camera. End of speculation for me.... I don't guess, I test!

  • I have tried to show you, repeatedly, even using your own example, that while you can get CLOSE, it isn't (always) EXACT.

    Close to what?

    Did you spend time and without emotions just read that had been written?

    It is also good to understand that this is a discussion; no one is forcing anyone to share any views.

    So, be easy.

  • @DrDave - The square root example is an ANALOGY. Since the sensor records data with ~12 bits, but the data is only recorded in 8 bits, there is a loss in accuracy. Just like when you take a square root from a calculator (to, say, 20 decimal places), but are only able to use 3 decimal places when trying to calculate the square. Yes, it was extreme, but I didn't want to use 20-place accurate numbers.

    @Vitaliy - I have tried to show you, repeatedly, even using your own example, that while you can get CLOSE, it isn't (always) EXACT. Which is all I've been trying to say. You keep jumping from one position to another, ignoring/belittling my (and others') efforts to illustrate my point.

    @GeoffreyKenner - 8-bit contains 256 data points. These are represented as 0-255. Thus, the 255 value in my example. Also, 10-bit is 0-1023 for 1024 data points.

  • @caveport, I don't have any allegiance to VK or anyone else...but I don't think he is saying that 2k10bit is an accurate representation of the source 4k8bit footage; all I see is simply that mathematically it really is 10bits of data coming from an 8bit down-sample.

    That does not mean it solves the limitations of banding, loss of color info, or any of the other side effects of the source 8bit signal.

    Correct or no? @Vitaliy_Kiselev

  • I think the buckets analogy is flawed. I suppose the topic title does contain the word 'flame' so I should not expect too much.

    Cool argument. :-) It is not flawed, as it is made to explain basic things.

  • @Vitaliy I think the buckets analogy is flawed. I suppose the topic title does contain the word 'flame' so I should not expect too much. Sorry for wasting everyone's time.

  • To check some practical results I used ffmpeg to convert some GH4 4k samples to yuv444 and to scale them down to 1920x1080. While the looks of the downsampled images were pretty good, I was shocked to see how little hardware support there is for yuv444: There is "theoretical" support in the HW acceleration APIs like vdpau and vaapi, but none of the computers I own has a GPU that actually supports yuv444 overlay visuals.

    So the only way to play back and watch the yuv444 encoded video at full quality was to use the RGB "x11" display driver, which is kind of wasting a lot of CPU cycles.
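For anyone wanting to repeat the experiment, a conversion along these lines can be done with stock ffmpeg options (the file names are hypothetical; the default scaler handles the chroma merge, and libx264 stores yuv444p using its High 4:4:4 profile):

```shell
# 4K 4:2:0 source -> 1080p 4:4:4 output
ffmpeg -i gh4_4k_clip.mov -vf scale=1920:1080 -pix_fmt yuv444p \
       -c:v libx264 -crf 16 gh4_1080p_444.mov
```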

  • Criticising my education when you have no knowledge of me or my background is not very polite. It is easy to slam people. I have tried to avoid that. Most people on this thread have no idea about how digital sampling works or how digital YCbCr signals work.

    I did not touch on you or your background. I just stated an obvious thing about the general education level and the level of discussion. And, in fact, I never touched on "digital sampling" or "YCC".

    VK please stop being so condescending/sarcastic and try to help others see where they're supposedly wrong. I, too, think that by converting the original 12-bit data off the ADC into 8-bit you have already discarded information which can never be recovered but can only be "guessed" with a clever algorithm.

    So, let's CAREFULLY read what was written:

    We have raw data; let it be a1, a2, a3, a4. Raw data is linear 12-bit data.
    Resulting 8-bit values will be b1=f(a1), b2=f(a2), b3=f(a3) and b4=f(a4), where f(x) is a non-linear transform. The usual approach is to try to preserve mid-tones more accurately and to compress shadows and highlights more :-)

    As is quite clear to see, f(a)=f(a1+a2+a3+a4) is generally not equal to b=f(a1)+f(a2)+f(a3)+f(a4)

    We are NOT talking about recovering some original raw data for individual pixels of 4K image.

    We are talking about the resulting 10-bit values after rescaling. Is it correct? Is it really 10-bit data? Yep. Is it equal to data made by first rescaling in raw and applying the transform afterwards? Generally, no. But it is pretty close, and if you know how to properly set the camera and use so-called flat picture profiles, you will get pretty accurate data.

    @caveport

    If one considers the case of a graduated luminance signal in 8bit, from left to right (pixel 1 to pixel 4096 for example) which has a variation of only 2 discrete 8bit 'steps' i.e. 235 to 236, across the full width of the image, then the mathematical transform will perfectly preserve those same values in a 10bit image, as there is no variation in the math until the value changes from 235 to 236. The banding is perfectly preserved.

    We have an issue here.

    Again. We are NOT talking about recovering some original raw or even 10-bit non linear data for individual pixels of 4K image. We are talking about image after rescaling.

    Let's reverse the example to make the big logic flaw easy to understand. Suppose you have X (between 1 and 1024) small balls in one big bucket. Now you get another 4 buckets, each exactly 1/4 the size of the big one (let's call them bad banding buckets :-) ). You put your hand in the big bucket, take one ball, and randomly (with exactly the same probability!) place it in one of the 4 small buckets.
    After you have moved all the balls, it is clear that it is all messy banding shit.
    Now let's do the reverse process. Put all the balls from the small buckets into the large one. To our surprise we again have the same X balls, and as X is arbitrary, it is clear that we preserved all the information and strangely got real 10-bit from horrible 8-bit.
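The bucket thought-experiment above is trivial to simulate (a minimal sketch; the bucket count comes from the example, and X is an arbitrary choice in the stated 1..1024 range):

```python
import random

X = 700                       # any ball count between 1 and 1024
small = [0, 0, 0, 0]          # the four "bad banding" buckets
for _ in range(X):            # move balls one at a time into a random small bucket
    small[random.randrange(4)] += 1

print(small)                  # four messy, individually meaningless counts
print(sum(small) == X)        # True: pouring them back together recovers X exactly
```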

  • @GlueFactoryBJJ: I never understood the "255" thing. 0 IS a value, it's 256.

    @caveport: "It's not mathematics, it's geometry!" -> You made my day.

    Beware of the downscale: higher bit depths have special properties like super white and super black that can be clamped. With 8-bit you'll never be able to retrieve this information no matter how many downscales you perform. Burned highlights in 8-bit are burned highlights at every bit depth you convert to. The same goes for line skipping, although there are methods to attenuate the effect in skies (like adding a thin layer of grain and then performing a denoise to mix and smooth the pixels).

    Otherwise, Vitaliy pretty much summed up the thing with the kid version. It's a bit more technical in reality, but you get the picture.

  • @Mistas

    The more important discussion is how much of a benefit will we see in a 10bit 2k, not whether or not it is "true" 10bit.

    Using the methods employed so far, simple spatial filtering, the answer is "not much". The things you'd really like it to just solve for you, like poor gradation, compression artifacts and ugly digital noise, are left intact during the reduction. You gain a little back from the color sub-sampling, but there are cleverer methods to achieve pseudo 4:4:4 from 4:2:0 without changing resolution.

  • Maybe I am missing something, but in my mind the most important consideration here is that the pixel count drops by a factor of 4 from 4K to 2K. It is impossible to get a perfect down sample no matter how many bits. Even if you went 4K 10-bit to 2K 10-bit there would be inaccuracies in the data because many pixels are gone. Is averaging pixel data from 4K to 2K at equal bit depth any different from adding the 8-bit values to get a 10-bit value? I doubt the difference would be noticeable.

    The more important discussion is how much of a benefit will we see in a 10bit 2k, not whether or not it is "true" 10bit.

  • VK please stop being so condescending/sarcastic and try to help others see where they're supposedly wrong. I, too, think that by converting the original 12-bit data off the ADC into 8-bit you have already discarded information which can never be recovered but can only be "guessed" with a clever algorithm.

    That f(a)=f(a1+a2+a3+a4) not being equal to b=f(a1)+f(a2)+f(a3)+f(a4) is exactly why going from 4K 12-bit to 4K 8-bit then back to 2K 10-bit will not yield results as accurate as going directly from 4K 12-bit to 2K 10-bit.

  • @Vitaliy Your mathematics is very good. But this application of math to produce a 10bit image will not eliminate banding in 8bit images such as skies with very gradual luminance and chroma variation.

    Criticising my education when you have no knowledge of me or my background is not very polite. It is easy to slam people. I have tried to avoid that. Most people on this thread have no idea about how digital sampling works or how digital YCbCr signals work.

    We do not have Raw data in 12bit. We have 4k 8bit 4:2:0 which we want to turn into 1920x1080 10bit 4:4:4.

    If one considers the case of a graduated luminance signal in 8bit, from left to right (pixel 1 to pixel 4096 for example) which has a variation of only 2 discrete 8bit 'steps' i.e. 235 to 236, across the full width of the image, then the mathematical transform will perfectly preserve those same values in a 10bit image, as there is no variation in the math until the value changes from 235 to 236. The banding is perfectly preserved.

    The colour accuracy or sharpness of colour channels will improve due to the size reduction to a 1920x1080 image. But bit depth is about how many discrete levels of brightness of the Y signal are stored and accuracy of the vector information stored in the CbCr channels.

    It's not mathematics, it's geometry!

    If you can make your math solve the problem of having to use higher bit depths to capture more information you are likely to have many dollars coming your way as this is what all the major imaging companies have been trying to achieve since the beginning of digital imaging.

    BTW, I did NOT have a 'modern' education.
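The graduated-luminance case described in this post can be checked numerically (a small sketch: a 4096-pixel-wide 8-bit "gradient" that only ever takes the values 235 and 236, downscaled by 2x2 pixel summing; vertical flatness is assumed, so both rows are identical):

```python
W = 4096
row = [235 if x < W // 2 else 236 for x in range(W)]   # two-step 8-bit "gradient"
img = [row, row]                                       # band edge runs top to bottom

# 2x2 pixel summing: each output sample is the sum of four 8-bit codes
out = [img[0][2*i] + img[0][2*i+1] + img[1][2*i] + img[1][2*i+1]
       for i in range(W // 2)]

print(sorted(set(out)))   # [940, 944]: no new in-between levels appear
```

The 10-bit result still contains only two levels with a gap of 4 codes between them, so the banding is preserved exactly as argued above.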

  • @GlueFactoryBJJ

    Let's try again. Simple case again, only b/w and only 2x2 image.

    We have raw data; let it be a1, a2, a3, a4. Raw data is linear 12-bit data.

    Resulting 8-bit values will be b1=f(a1), b2=f(a2), b3=f(a3) and b4=f(a4),
    where f(x) is a non-linear transform.
    The usual approach is to try to preserve mid-tones more accurately and to compress shadows and highlights more :-)
    f(x) changes according to camera settings.

    With rescaling, all we get is a single element b. The simplest method is to make b=b1+b2+b3+b4 (remember! we do not have a1..a4 available!).
    As is quite clear to see, f(a)=f(a1+a2+a3+a4) is generally not equal to b=f(a1)+f(a2)+f(a3)+f(a4)
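The inequality can be illustrated with a toy transform (a sketch only: a plain 1/2.2 gamma stands in for the real, settings-dependent f(x), the raw values are made up, and the raw-side result is averaged and rescaled by 4 so both quantities live on the same 0..1020 range):

```python
def f(a):
    """Toy transfer curve: 12-bit linear code (0..4095) -> 8-bit code (0..255)."""
    return round(255 * (a / 4095) ** (1 / 2.2))

a1, a2, a3, a4 = 100, 200, 300, 400       # linear raw values of a 2x2 block

b = f(a1) + f(a2) + f(a3) + f(a4)         # what the 8-bit rescale can compute
ideal = 4 * f((a1 + a2 + a3 + a4) // 4)   # transform applied while raw is available

print(b, ideal)                           # close, but generally not equal
```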

  • @GlueFactoryBJJ Your example about the square root is not correct as far as certain calculations like bit depth go. If you are "holding" 12 places, the rounding will produce exactly the same result. It is only when you are adding places that you get a different result. Cameras (and audio) do not add more than a few floating places, which are finite. However, if you are adding random-number dither, you will of course get a different result. That is the nature of dither.

    @Vitaliy rofl

  • That is why JPGs always look different from RAW files.

    Please ask people to sit down before reading this statement, as it is dangerous to their health if they are standing.

    Let me rephrase it a little for you to understand absurdity:

    "That is why roast beef always looks different from cows."

  • @Vitaliy and others - I guess I must be really dense this week. I can't see how you can say that 2^12 gradations (4096) is the same as 2^8 (256) regardless of whether it is linear or not. I never said that 8 bit (log) couldn't cover the same DR, just that there are not as many data points to use to recover detail, especially in the shadow areas. Unless I am completely misunderstanding you, it is like saying that measurements in whole meters (8-bit) are equivalently accurate to whole centimeters (12-bit).

    Exposure To The Right (ETTR, referring to the histogram, full description on the Luminous Landscape site) uses the log allocation of data points as the basis for the technique.

    I GET that, mathematically, you can get 10-bit values from 4 * 8-bit. Not a problem. However, let's take a more detailed look at the analogy you have used.

    1. Let's say you have 4K 10-bit data that is being converted into 8-bit data (as is the case with internal recording on the GH4). We have 4 "buckets" as you have described them, using 10-bit monochrome, as you did to simplify the example. Here are the bucket values:

    Bucket 1 = 1020 Bucket 2 = 1021 Bucket 3 = 1022 Bucket 4 = 1023

    Now let's convert these to 8-bit "equivalents".

    Bucket 1 = 255 Bucket 2 = 255 Bucket 3 = 255 Bucket 4 = 255

    Now let's convert these 8-bit equivalents to 2K 10-bit, using your additive method.

    255+255+255+255 = 1020

    Now let's convert the original 10-bit values to a 2K 10-bit value.

    (1020+1021+1022+1023)/4 = 1021.5 ~ 1022.

    Ok, only about a .2% error. However, when we look at COLOR values, the interaction of the CbCr can result in color changes that can be more noticeable.

    Hey, it is close, but not exact. That is why JPGs always look different from RAW files.
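The arithmetic in this worked example is easy to verify (a direct transcription of the bucket values above; `// 4` is the 10-bit-to-8-bit conversion, which clips all four near-white samples to 255):

```python
orig = [1020, 1021, 1022, 1023]          # four adjacent 10-bit samples near clip

eight = [min(255, v // 4) for v in orig] # 8-bit "equivalents": all become 255

recovered = sum(eight)                   # additive 8-bit -> 10-bit reconstruction
true_10bit = round(sum(orig) / 4)        # downsampling the original 10-bit data

print(recovered, true_10bit)             # 1020 1022: close, but not exact
```

The distinct 10-bit values collapse to one 8-bit code, so the reconstruction lands 2 codes (about 0.2%) below the true downsampled value.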