The main use of multimodal models is understanding images and responding to them (mostly with code), but you(.)com currently doesn't support this! Please make this a priority.