The Wandering Lensman: Amazing! Startling! Even Scary On Several Levels!

We all know that artificial intelligence (AI) is coming to photography in a very big way. We've already seen it in our cameras in the form of features and abilities that were previously not possible. We've also seen it in our phones on a much greater scale. Get ready because I believe AI will change a lot about photography in the near future.

I was sent this YouTube video (link below) by my son, who is deep diving into how all of this AI fundamentally works. I was quite startled not only by what the presenter in the video does but also by the ramifications that immediately came to mind. I recommend my fellow photographers watch it, as you will be amazed. What's in the video?

You have probably heard of ChatGPT. It is an artificial intelligence program by OpenAI. It is a language-based model that is designed to interact with people in a conversational manner. Effectively, it's a chatbot. People use it for purposes that range from writing stories and school essays to drafting business plans and generating computer code. It is, in and of itself, quite amazing and controversial, as it has quickly become so good that one has a hard time telling a machine-generated answer from a human-written one. ChatGPT is about words and thoughts. We continue...

You might have also heard of Midjourney. Midjourney is an artificial intelligence program and service that generates images from natural language descriptions called "prompts." It is similar to OpenAI's DALL-E and to Stable Diffusion. Midjourney is about images. You type in a detailed description of what you want your image to look like and the program creates it. The trick is in how carefully and in how much detail you construct and type in the description in order to get the best results. It takes a lot of trial and error to generate the image you want. Now...

What happens if you combine the two? That is what this video illustrates. The presenter 'teaches' (yes, teaches) ChatGPT how the intricacies of Midjourney work, then uses ChatGPT to create the extremely precise "prompts" (which are hard for a human to construct), which he then feeds to Midjourney to create some amazing images. By combining two different types of artificial intelligence programs, he leverages the power and specialty of both: ChatGPT writes prompts of such detail that Midjourney can then create the most detailed and accurate images. In other words, he uses one type of AI to talk to another type of AI to get the desired result.

The images shown in the video are not perfect but remember all this is still in its infancy. In a couple of years, I predict it will be difficult to tell the difference between a digital image made with light, sensor, lens and camera and one that is completely artificially generated. I found it fascinating that artificial portrait-type images of individuals had a very shallow depth of field and landscape images had a very deep depth of field. When I looked at the instructions for generating those images, I saw that focal lengths and aperture values were given and evidently, the program understands what attributes are created with different combinations of focal lengths and aperture values. I know people who can’t seem to grasp that concept.
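For anyone who wants to see the numbers behind that concept, here is a minimal sketch of the standard thin-lens depth-of-field calculation. It is my own illustration, not something from the video: the function name, the two example lens settings and the 0.03 mm circle of confusion (a commonly used full-frame value) are all assumptions chosen for the example.

```python
import math


def depth_of_field(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Return (near_mm, far_mm), the limits of acceptable sharpness.

    Uses the usual thin-lens approximation. coc_mm is the circle of
    confusion; 0.03 mm is an assumed full-frame value.
    """
    # Hyperfocal distance: focus here and everything out to infinity is sharp.
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    if subject_mm >= hyperfocal:
        # Focused at or beyond the hyperfocal distance: sharp to infinity.
        far = math.inf
    else:
        far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far


# Illustrative settings only: a portrait lens wide open at 2 m,
# and a wide-angle lens stopped down and focused at 5 m.
portrait = depth_of_field(85, 1.8, 2000)
landscape = depth_of_field(24, 11, 5000)

for label, (near, far) in (("portrait", portrait), ("landscape", landscape)):
    print(f"{label}: sharp from {near / 1000:.2f} m to {far / 1000:.2f} m")
```

With these assumed settings, the portrait's zone of sharpness is only a few centimeters deep, while the landscape is sharp from a little over a meter out to infinity. That is the same pattern I noticed in the generated images.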

Think of the movie industry. As realistic as the computer-generated (CGI) parts of movies are now, there may soon come a time when actors aren't used at all. Entire movies, audio and video, will look just like real life and as though they were made the way movies have been for the past hundred years, but no one will actually be acting or speaking. There will only be men and women sitting in an office writing a story (which realistically may be written by a program such as ChatGPT) and developing video "prompts" for another program like Midjourney to completely create the "movie," which we, the audience, will think is real.

Here is the scary part. We already know about and see "Deep Fake Video" with "Deep Fake Audio" showing politicians, actors and other well-known people saying things they never said. It is already almost impossible to tell the difference. Wait until this AI progresses over the next several years. Nothing will be real. Or, at least, we won't be able to be sure anything is real. We won't be able to trust anything we read, hear or see.

Here is the YouTube video:

GPT-4 + Midjourney V5 = A New Era of Photography? - WOW! - YouTube

The question remains... did I really write this, or was this post generated by an AI program?

Join me over at my website, https://www.dennismook.com.

Thanks for looking. Enjoy!

Dennis A. Mook

All content on this blog is © 2013-2023 Dennis A. Mook. All Rights Reserved. Feel free to point to this blog from your website with full attribution. Permission may be granted for commercial use. Please contact Mr. Mook to discuss permission to reproduce the blog posts and/or images.

1 comment:

Anonymous, April 17, 2023 at 7:12 PM
I watched the 60 Minutes article and interviews this past Sunday on Google Bard. Truly amazing and frightening all at the same time. Especially alarmed when the Google executive said they did not know exactly how the AI programs were talking to each other. To me, that signaled they are in unexplored territory, without firm control, and the machines were acting independently of the controllers. Worse than Sci-Fi.
I recommend all to watch the 60 Minutes article on the CBS website.
This is a far stretch from what Topaz calls AI.

Friday, April 14, 2023
