Posts by Collection

interests

Maps

gdpmap2021

This GDP cartogram of the world in 2021 is a general representation of the world's economic output as measured by GDP. One hexagon tile is roughly equal to 1/1000 of the world's GDP in 2021. GDP has its faults as a measure of prosperity, but since it is the total value of goods and services produced in a country, it can give a rough idea.
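The tile arithmetic can be sketched as follows; this is a minimal illustration, and the GDP figures are approximate 2021 values in trillions of USD, not taken from the map's source data:

```python
def tile_count(country_gdp, world_gdp):
    """Number of hexagon tiles a country gets on the cartogram,
    where one tile represents 1/1000 of world GDP."""
    tile_value = world_gdp / 1000
    return round(country_gdp / tile_value)

# Approximate 2021 figures in trillions of USD (illustrative only)
world_gdp = 96.5
usa_gdp = 23.3
tile_count(usa_gdp, world_gdp)  # → 241 tiles
```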

Three regions primarily drive the world's economy: North America, Europe, and East Asia. The USA is the world's largest economy and makes up the majority of North America's GDP. Europe is much more segmented, with several significant contributors including Germany, the UK, and France. East Asia's GDP distribution is simpler: China, the world's 2nd-largest economy, and Japan, the 3rd-largest (the ranking fluctuates between Japan and Germany), dominate its output.

portfolio

projects

3D Open Vocabulary Semantic Segmentation for Robot Navigation

1. Introduction

VLMaps is a spatial map representation that embeds pretrained visual-language features into a 3D reconstruction and projects them onto a top-down 2D map. VLMaps attaches pixel-level visual features from an LSeg visual encoder to points in a point cloud. These points are then projected onto a top-down navigation map, where only the highest point is kept. The visual features are then compared via cosine similarity against textual features from a CLIP text encoder to determine each point's semantic label. Due to the top-down projection, the robot cannot perform 3D navigation such as "go to the plant below the table." Addressing this limitation was our main goal with this project. We used the Matterport3D Dataset.
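The labeling step described above (cosine similarity between per-point visual features and CLIP text embeddings) can be sketched as follows; the function name and array shapes are my own illustration, not VLMaps' actual code:

```python
import numpy as np

def assign_labels(point_feats, text_feats):
    """Assign each 3D point the label whose CLIP text embedding is most
    similar (by cosine similarity) to the point's visual feature.

    point_feats: (N, D) per-point visual-language features (e.g., from LSeg)
    text_feats:  (K, D) text embeddings for K candidate labels
    returns:     (N,) index of the best-matching label for each point
    """
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = p @ t.T  # (N, K) cosine similarity matrix
    return sims.argmax(axis=1)
```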

Image Processing for Fisheye Camera Image Object Detection

1. Introduction

I participated in the 2024 AI City Challenge as part of a team from UW. Our team followed an ensemble-learning approach in which each person ran individual experiments and trained separate detectors. One person focused on finding additional fisheye-camera image data, another focused on data augmentation, and I focused on image color transformation to improve detection. Specifically, a subset of the data consisted of black-and-white images, and my job was to improve performance on them. I transformed my entire dataset to black-and-white images and trained YOLOv8 detectors on the transformed data, which improved performance on the black-and-white validation data.
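The color transformation can be sketched as follows; this is a minimal illustration of the idea using standard luminance weights, not the exact preprocessing pipeline I used:

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB image to grayscale with the standard
    ITU-R BT.601 luminance weights, then repeat the channel so the
    result keeps the 3-channel shape a YOLOv8 detector expects."""
    lum = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return np.repeat(lum[..., None], 3, axis=-1)
```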

Evaluating Foundation Model Robot Pose Estimation with Synthetic Data Generation

1. Introduction

Position and orientation, or "pose", is typically represented as a 4x4 homogeneous transformation matrix that encodes the translation ("position") and rotation ("orientation") of an object. One reason to care about robot pose estimation is that accurately predicting the two pose matrices, one for the robot and one for an object, enables calculation of a "relative grasp" transform that describes how the robot should position itself to grasp the object successfully.
Block Diagram
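The relative-grasp computation can be sketched as follows, assuming both poses are 4x4 homogeneous transforms in a shared world frame (the function name and frame conventions here are illustrative, not the project's actual code):

```python
import numpy as np

def relative_grasp(T_robot, T_object):
    """Express the object's pose in the robot's frame:
    T_rel = inv(T_robot) @ T_object.
    Both inputs are 4x4 homogeneous transforms in the same world frame."""
    return np.linalg.inv(T_robot) @ T_object
```

For example, with the robot at x=1 and the object at x=3 (no rotation), the relative transform places the object 2 units ahead of the robot.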

Image-Captioning Tactical Advisor Model (ICTAM)

1. Introduction

ICTAM was my first attempt at applying pretrained LLMs to tactical analysis and advisory tasks. This work was inspired by CICERO and LLMs play sc2. I primarily wanted to test the pretrained capabilities of image-captioning models and see whether they could make accurate tactical judgements with minimal supervised finetuning.

Regularization and Hyperparameter Tuning of Low-Rank Autoregressive Models

1. Introduction

I conducted regularization and hyperparameter-tuning experiments on the low-rank autoregressive models used in the paper "Active Learning of Neural Population Dynamics using Two-Photon Holographic Optogenetics." One motivation is the strong need for techniques that minimize the amount of data required to learn neural population dynamics, given experimental time and resource constraints. The long-term goal that the paper works toward is a model that actively learns which stimulation patterns of neurons produce the most informative neural responses, in order to quickly learn the neural population dynamics (brain activity).
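A low-rank autoregressive model with ridge regularization can be sketched as follows; this is a minimal two-step illustration (regularized least squares, then a rank constraint via truncated SVD), not the fitting procedure from the paper:

```python
import numpy as np

def fit_low_rank_ar(X, rank, ridge=1e-2):
    """Fit x_{t+1} ~ A @ x_t with A constrained to a given rank.

    X: (T, N) array of neural activity over T time steps.
    Returns an (N, N) dynamics matrix of rank at most `rank`.
    """
    X0, X1 = X[:-1], X[1:]  # one-step predictors and targets
    n = X.shape[1]
    # Ridge-regularized normal equations (solved for A's transpose)
    A = np.linalg.solve(X0.T @ X0 + ridge * np.eye(n), X0.T @ X1).T
    # Project onto the rank constraint: best rank-r approximation via SVD
    U, s, Vt = np.linalg.svd(A)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]
```

The `ridge` strength and `rank` are exactly the kind of hyperparameters the tuning experiments sweep over.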

Evaluating Sensor Fusion SLAM

Evaluating Multisensor-aided Inertial Navigation System (MINS)

1. Introduction

MINS is a sensor fusion SLAM system capable of fusing IMU, camera, LiDAR, GPS, and wheel sensors. My goal was to evaluate it on the KAIST Urban Dataset as well as my lab's (at the University of Washington) suburban rover dataset. I used kaist2bag to convert the separate per-sensor bag files into a single bag.

publications

talks