Naver ventures into AI image search business

    2024.08.22 11:13:16
    Naver announced that it will add visual processing capabilities to its AI agent, CLOVA X. The image shows the ‘chart understanding’ feature of HyperCLOVA X. [Courtesy of Naver Corp.]

    South Korean platform giant Naver Corp. is set to integrate image recognition capabilities into its conversational artificial intelligence (AI) agent service, CLOVA X. Global tech giants such as OpenAI Inc. and Google LLC are moving towards more sophisticated chatbots by developing multimodal AI that can simultaneously understand and process various forms of data, such as text, images, and speech. Naver, by contrast, is focusing on a free model to achieve user lock-in effects.

    According to sources from the information technology (IT) industry on Wednesday, Naver plans to add multimodal functionality to CLOVA X later in August 2024, which will allow the recognition of images and related query responses. This feature may be available to general CLOVA X users.

    Users could upload a photo of a math problem, for example, and ask CLOVA X for an answer, or ask it to write a poem related to the image. This would be a step up from text-based interactions to image-based ones. CLOVA X could previously handle documents in formats such as PDF, TXT, HWP, and DOCX and engage in related conversations, but it could recognize only the text within them, which limited interactions involving charts or graphs. Users will now be able to request tasks such as creating proposals based on graph data, allowing for more specialized functions.

    Naver is also testing image editing features on CLOVA X, allowing users to delete or modify parts of uploaded images. These features, including changing image backgrounds or altering the colors of clothing, will be rolled out to all users in stages after further refinement. The exact release date for the editing features has yet to be determined.

    LG AI Research recently introduced an image-based query response agent in the trial version of its generative AI service Chat EXAONE 3.0, but this service is restricted to LG employees and no official release date has been set yet.

    Meanwhile, major global tech firms have already integrated image search AI into their services. Google’s AI chatbot Gemini supports image uploads for Q&A, and similar features are available in OpenAI’s ChatGPT and Anthropic’s Claude chatbots.

    Global tech companies are also developing AI agents capable of recognizing speech and video.

    “The integration of image search AI enhances usability and accessibility at the input stage, making image-based searches more useful,” Ha Jung-woo, head of Naver Cloud’s AI Innovation Center, said.

    Image search and analysis AI is not limited to conversational AI agents, however. Google introduced Google Lens, which provides real-time information about images via smartphone cameras. Image search AI is expected to be particularly powerful in chatbots and e-commerce, allowing users to easily find similar products by uploading a picture of a desired piece of clothing or other item.

    By Ko Min-suh, Lee Sang-duk, and Lee Eun-joo
    [ⓒ Pulse by Maeil Business News Korea & mk.co.kr, All rights reserved]