Related documents: Spatial-LLaVA: Enhancing Large Language Models with Spatial Referring Expressions for Visual Understanding (conference proceeding)