We put on the first demonstration of true megapixel analytics in the industry at our ISC West booth last week.
It was eye-catching. Lots of people stood there staring at the analytics detecting people, cars, trucks, motorcycles, sailboats, speedboats, etc. Here’s a picture:
Unfortunately, this blog can’t show the full resolution or video, which you really need to see to appreciate how incredible it looks.
When I say this is the first public display of “true megapixel analytics” I mean the resolution being analyzed is megapixel. There have been cameras with megapixel video that have had analytics processing before. CoVi is a good example, may they rest in peace. They sold a 1 MP camera that ran ObjectVideo analytics. However, the resolution of the analytics was only CIF (320 x 240 pixels), which gave hardly any detection range. It was silly to put CIF analytics on a megapixel camera.
Why hasn’t anyone ever demonstrated megapixel analytics before? Because of the sheer processing power that other technologies need to do this.
VideoIQ’s technology is different. We need about 1/8th the amount of processing power compared to other high quality analytics systems. So, we can run the whole thing in one of the popular low cost DSP processors. But all other analytics technologies need a lot more horsepower.
For example, ObjectVideo on their web site claims they can run up to 4CIF resolution video in the same DSP chip we are using. However, in most cases the users of OV onboard are only running CIF resolution, because there are serious limitations running 4CIF, such as only being able to have one rule running at a time and a limited number of objects that can be detected.
IOimage uses two DSP processors in their cameras to get high quality and avoid compromising detection.
The camera we demonstrated was a 1080p camera, which is 1920 x 1080 pixels. We demonstrated it live at the show, with the analytics all running in the camera. It provides 3X the horizontal coverage of a standard resolution camera, and more than 2X the analytics detection distance.
For other technologies to run 1080p analytics, they would need more than 6 times as much processing power, compared to 4CIF video. That would mean 6 DSP chips, or some very expensive high end DSP chips.
If you try to run this on a server or a PC, you would need a full dual core processor to run one camera. So, you can see why it’s never been shown before. It is impractical for other technologies.
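The resolution figures above can be checked with simple pixel arithmetic. The sketch below is only illustrative: the 1080p size comes from this post, but the 4CIF size (704 x 480) and the "standard resolution" size (640 x 480) are assumptions, and it assumes processing load grows in proportion to pixel count and detection distance in proportion to vertical resolution, which is a simplification.

```python
# Frame sizes in pixels (width, height).
# 1080p is from the post; the other two sizes are assumed NTSC-style values.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "4CIF": (704, 480),   # assumption
    "SD": (640, 480),     # assumption for "standard resolution"
}


def pixel_count(name):
    """Total pixels per frame for a named format."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(a, b):
    """How many times more pixels per frame format a has than format b."""
    return pixel_count(a) / pixel_count(b)


def width_ratio(a, b):
    """Ratio of horizontal pixel counts."""
    return RESOLUTIONS[a][0] / RESOLUTIONS[b][0]


def height_ratio(a, b):
    """Ratio of vertical pixel counts."""
    return RESOLUTIONS[a][1] / RESOLUTIONS[b][1]


print(f"1080p vs 4CIF pixels: {pixel_ratio('1080p', '4CIF'):.2f}x")  # 6.14x
print(f"1080p vs SD width:    {width_ratio('1080p', 'SD'):.2f}x")    # 3.00x
print(f"1080p vs SD height:   {height_ratio('1080p', 'SD'):.2f}x")   # 2.25x
```

Under those assumptions, the numbers line up with the "more than 6 times", "3X" and "more than 2X" figures quoted above.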
The other industry first we showed is something we call IQTrack. It uses the video analytics to automatically track and zoom on objects in the field of view. Here’s a picture:
This is different from PTZ camera tracking. If you look at the lower left of the picture, you will see that the whole field of view is still being recorded and it shows where in the scene you are zoomed into. So, you can always go back later and pick another part of the video to look into.
The other unique thing is, if many people are in the area, you can click on one person and it will zoom in on and track just that one person. That’s never been shown before either.
Watching it, you can immediately see that there is no comparison between watching video that is automatically zooming and tracking on important objects, versus static video cameras. It pulls your eyes to exactly what is important. I think this is going to be very popular for megapixel cameras.
The 1080p cameras we sold will also be the first cameras to ship with a new imager from Sony that has some amazing low light performance. We are still testing it, but it looks to be 2X-4X better than any other multi-megapixel imager used in the security industry.
And of course, the camera we showed included a hard drive so that you can store 1-2 months of high quality 1080p video. This solves the bandwidth problem for megapixel cameras, since it needs no bandwidth to record, and eliminates the need for external storage in most cases.
Now that true megapixel analytics have arrived, I think they are going to set the standard, and I think they add incredible visual value to megapixel cameras, even if you don’t want the analytics for detecting alarms.


April 2, 2010 at 9:24 pm
[...] True Megapixel Analytics Have Arrived « Spot On Security [...]
April 4, 2010 at 4:40 am
Hello Doug,
“Unfortunately, this blog can’t show the full resolution or video, which you really need to see to appreciate how incredible it looks.”
I think you should put those videos on Vimeo, for example, and it would be nice to see a resized video here, just for an overview.
Regards,
Marian
April 4, 2010 at 11:06 am
Hi Doug,
Nice article!
But wouldn’t it be more cost-effective to have a better analytics performance ratio (<1%) at CIF resolution?
April 5, 2010 at 3:52 am
Hi Doug
Good to see the image on this & congrats on releasing the megapixel analytics camera.
I’m curious to know the performance delta you got when you ran the analytics on 1080p. I still see some false positives, such as a boat and a car being classified as the same thing (assuming the blue color indicates the same class). I also see some people being missed (the people sitting on the bench and the person walking near the lamp). Perhaps the people at the far end are too small to detect, which is okay. If you were to run the same analytics on a resized CIF video, how many objects (people/cars/boats) would you miss, and how many false positives would you get? I’m sure you’ve done the maths already; if you can publish that ground truth data and do a comparison with CIF video, it would help your case for why analytics on a megapixel camera makes sense.
If it’s okay to mention, do let me know the DSP chip you’re using as well.
April 5, 2010 at 3:54 pm
Marian,
We are going to be putting up some video samples on our website. We definitely would like to show it in action.
I’ve sent you a video clip via email so that you can see it for yourself.
Thanks for your comments.
April 5, 2010 at 3:56 pm
Joko,
Thanks for writing. I’m not sure I understand your question, however.
Can you explain what you mean by better analytics performance ratio?
Thanks.
Doug.
April 5, 2010 at 4:16 pm
Gautham,
First, those are not false positives with the boats and vehicles. We put blue boxes on both, intentionally. We rarely see a scene where both boats and vehicles need to be detected at the same time, so we always mark them as blue. Plus, with too many colors it just gets confusing what all the colors mean. So, we always use blue for both boats and vehicles. Those are all true detections.
Yes, people who are far away and not moving much are missed in this frame. But as soon as they start to move, they are detected. For example the person by the lamp stands still for a lot of the time, but when he does move, he is detected. Some of the people farthest away are not, since they are at the limits of the detection range, but most are. That’s why this is a good video, since it gives a good idea of the detection range limits.
If you resized this video to CIF and ran the analytics at that resolution, you would get less than 1/4th of the detection distance. So, picture the people that are being detected farthest away. Now cut that distance in half. Then cut it in half again. Your detection limit would be about 10% less than that.
In other words, you could detect the two people on the closest side of the road, but would probably not be able to detect anyone on the other side of the street. That gives an idea about how big the difference is.
Plus, 1080p video is a lot wider than CIF, so you can cover a much wider field of view, since it is a 16:9 ratio rather than 4:3. That gives you a 33% wider view, along with more than 4X the detection range.
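For anyone who wants to check the arithmetic behind those range and width figures, here is a rough sketch. It assumes detection distance scales with the number of vertical lines for the same vertical field of view, and it uses 240 lines for CIF, as stated in the post; both are simplifications rather than a description of how the analytics actually work.

```python
from fractions import Fraction

CIF_HEIGHT = 240    # lines, as stated in the post
HD_HEIGHT = 1080    # lines in 1080p

# Fraction of the 1080p detection distance available at CIF,
# assuming distance scales with vertical resolution.
range_fraction = Fraction(CIF_HEIGHT, HD_HEIGHT)

# How far that falls short of "half, then half again" (one quarter).
shortfall_vs_quarter = 1 - range_fraction / Fraction(1, 4)

# Extra width of a 16:9 frame over a 4:3 frame of the same height.
width_gain = Fraction(16, 9) / Fraction(4, 3) - 1

print(f"CIF range as a fraction of 1080p range: {float(range_fraction):.3f}")  # 0.222
print(f"Shortfall relative to one quarter: {float(shortfall_vs_quarter):.1%}")  # 11.1%
print(f"Extra width of 16:9 over 4:3: {float(width_gain):.1%}")                 # 33.3%
```

So the range at CIF works out to 2/9 of the 1080p range, which is roughly 10% under one quarter, and 1080p gives 4.5X the range of CIF.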
As for the DSP, we are using one of the smallest in the TI DaVinci family.
Thanks.
Doug.
April 5, 2010 at 4:39 pm
Hi Doug, sorry for not being that clear.
1) Imagine the minimum size in motion you can detect is 5%. At 1080p, that should represent something like 322 px².
2) Imagine the minimum size in motion you can detect is 1%. At the same resolution you would detect something like 32 px², which means that the distance range also depends on this performance, and you could probably get the same distance range in CIF format. Nevertheless, we would lose FOV…
The question is: do VCA manufacturers release this performance factor (smallest detectable area) so that we integrators can judge what fits our projects best?
Best.
Joko
April 5, 2010 at 6:32 pm
Joko,
Okay, I understand your question now. Thanks for explaining.
Yes, everyone tries to get the best possible detection with the fewest possible pixels. Everyone has put a lot of research into this, so there are no big opportunities left for making it a lot better. That’s why you need more pixels if you want to improve detection range.
For example, our technology can do reliable detection with only about 5 pixels per foot. That’s pretty good in the industry. If you look at the person farthest away who is in the middle of the picture, that person is just about 33 pixels tall in the original video. The two people standing on the closest side of the street are about 115 pixels tall.
If you viewed this same scene with a CIF resolution imager, those two people on this side of the street would only be about 25 pixels tall. So, they would actually be the limit for detecting a person 5-6 feet tall with CIF resolution.
Even people who are watching video carefully can’t accurately recognize people at much less than 5 pixels per foot. There just isn’t enough information to go on. There are some simple motion detection approaches that can get you down to 2 pixels per foot, but they are highly unreliable and generate far too many false alarms. People would be making the same kinds of mistakes all the time with objects that small. You would be guessing at whether it was a person or something else.
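The pixel heights above can be worked through as follows. This is a minimal sketch using the numbers quoted in this comment (5 pixels per foot, 33 and 115 pixel heights at 1080p) and 240 lines for CIF; the function names are just for illustration.

```python
MIN_PIXELS_PER_FOOT = 5           # detection threshold quoted above
HD_HEIGHT, CIF_HEIGHT = 1080, 240  # vertical lines in each format


def scaled_height(pixels_at_1080p, target_lines=CIF_HEIGHT):
    """Pixel height of the same object if the scene were imaged at target_lines."""
    return pixels_at_1080p * target_lines / HD_HEIGHT


def min_detectable_pixels(person_height_ft):
    """Smallest pixel height at which a person of this height meets the threshold."""
    return MIN_PIXELS_PER_FOOT * person_height_ft


def is_detectable(pixel_height, person_height_ft):
    return pixel_height >= min_detectable_pixels(person_height_ft)


near_cif = scaled_height(115)  # the two people on the closest side of the street
far_cif = scaled_height(33)    # the farthest person in the middle of the picture

print(f"Near people at CIF: {near_cif:.1f} px")  # 25.6 px
print(f"Far person at CIF:  {far_cif:.1f} px")   # 7.3 px
print(is_detectable(near_cif, 5.0))  # True: just over the 25 px needed
print(is_detectable(near_cif, 6.0))  # False: 30 px would be needed
print(is_detectable(far_cif, 5.0))   # False
```

That is why the near pair sits right at the limit at CIF: a 5 foot person just clears the threshold, while a 6 foot person at the same distance does not.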
Thanks.
Doug.
April 5, 2010 at 11:20 pm
I get what you are saying, but if you want to detect people crouching or crawling towards the camera, these little details are important.
I agree the level of false alarm needs to be reasonable.
April 6, 2010 at 11:43 am
Joko,
I agree the little details are important. That is exactly why more pixels are needed. Without more pixels, you can’t see the details.
Taking the example that you gave, if you had a person captured with 32 pixels, this would be about 9 pixels tall and 3-4 pixels wide. At only 3-4 pixels wide, you aren’t going to clearly see their arms or legs.
When you look at the distant person in the image, in the middle of the scene, as I said, that person is about 33 pixels tall by maybe 10 pixels wide. You can then see the legs and arms moving, which enables high quality recognition of this object as a person.
If someone wants to look at pictures or video to recognize who a person is in the picture or to read a license plate, you need a lot more pixels. The general rule is that you need about 50 pixels per foot. That would make a 5.5 foot tall person 275 pixels tall. That’s more than twice as tall as the closest people in the picture above. If you were using CIF resolution, that person would take up the full image, meaning that the camera would have to be looking exactly at them, since you wouldn’t be able to see anything else in the scene but one person.
That’s why higher resolution is so valuable for recognition.
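As a quick check on the recognition numbers, here is a small sketch using the 50 pixels per foot rule of thumb and the 5.5 foot person from this comment, with 240 lines assumed for CIF as elsewhere in this thread.

```python
RECOGNITION_PIXELS_PER_FOOT = 50   # rule of thumb quoted above
PERSON_HEIGHT_FT = 5.5
NEAREST_PERSON_PX = 115            # closest people in the 1080p picture
FRAME_HEIGHTS = {"CIF": 240, "1080p": 1080}  # vertical lines per format

# Pixel height needed to recognize the person.
needed_px = RECOGNITION_PIXELS_PER_FOOT * PERSON_HEIGHT_FT

# How that compares with the closest people in the picture.
vs_nearest = needed_px / NEAREST_PERSON_PX

print(f"Pixels needed: {needed_px:.0f}")                  # 275
print(f"Relative to nearest people: {vs_nearest:.2f}x")   # 2.39x
for name, lines in FRAME_HEIGHTS.items():
    # Share of the frame height one recognizable person would occupy.
    print(f"{name}: {needed_px / lines:.0%} of frame height")
# CIF: 115% of frame height
# 1080p: 25% of frame height
```

At CIF the person would not even fit in the frame at recognition quality, while at 1080p they take up about a quarter of the frame height.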
Thanks.
Doug.