Voxel51
United States
Joined Jun 8, 2019
Voxel51 is bringing transparency and clarity to the world's data. We build software that enables developers, scientists, and organizations to build high-quality datasets and computer vision models that power some of today's most remarkable machine learning and artificial intelligence.
Get open source FiftyOne: github.com/voxel51/fiftyone
Learn about FiftyOne Teams: voxel51.com/fiftyone-teams/
ECCV 2024: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
In this talk, I will introduce our recent work on open-vocabulary 3D semantic understanding. We propose a novel method, Diff2Scene, which leverages frozen representations from text-image generative models for open-vocabulary 3D semantic segmentation and visual grounding tasks. Diff2Scene requires no labeled 3D data and effectively identifies objects, appearances, locations, and their compositions in 3D scenes.
ECCV 2024 Paper: Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
arxiv.org/abs/2407.13642
About the Speaker
Xiaoyu Zhu is a Ph.D. student at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Her research interests are computer vision, multimodal learning, and generative models.
Views: 84
Videos
ECCV Redux: Zero-shot Video Anomaly Detection: Leveraging LLMs for Rule-Based Reasoning
88 views, 21 hours ago
Video Anomaly Detection (VAD) is critical for applications such as surveillance and autonomous driving. However, existing methods lack transparent reasoning, limiting public trust in real-world deployments. We introduce a rule-based reasoning framework that leverages Large Language Models (LLMs) to induce detection rules from few-shot normal samples and apply them to identify anomalies, incorpo...
ECCV 2024 Redux: Day 3- Closing the Gap Between Satellite & Street View Imagery Generative Models
71 views, 1 day ago
Closing the Gap Between Satellite and Street-View Imagery Using Generative Models. With the growing availability of satellite imagery (e.g., Google Earth), nearly every part of the world can be mapped, though street-view images remain limited. Creating street views from satellite data is crucial for applications like virtual model generation, media content enhancement, 3D gaming, and simulations...
ECCV 2024 Redux: Day 3- High-Efficiency 3D Scene Compression Using Self-Organizing Gaussians
225 views, 1 day ago
In just over a year, 3D Gaussian Splatting (3DGS) has made waves in computer vision for its remarkable speed, simplicity, and visual quality. Yet, even scenes of a single room can exceed a gigabyte in size, making it difficult to scale up to larger environments, like city blocks. In this talk, we’ll explore compression techniques to reduce the 3DGS memory footprint. We’ll dive deeply into our n...
ECCV 2024 Redux: Day 3- Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Seg.
42 views, 1 day ago
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures. We present Skeleton Recall Loss, a novel loss function for topologically accurate and efficient segmentation of thin, tubular structures, such as roads, nerves, or vessels. By circumventing expensive GPU-based operations, we reduce computational overheads by up to 90% compared to the ...
ECCV 2024 Redux: Day 1 - Tree-of-Life Meets AI
53 views, 1 day ago
A central challenge in biology is understanding how organisms evolve and adapt to their environment, acquiring variations in observable traits across the tree of life. However, measuring these traits is often subjective and labor-intensive, making trait discovery a highly label-scarce problem. With the advent of large-scale biological image repositories and advances in generative modeling, ther...
ECCV 2024 Redux: Day 1 - Robust Calibration of Large Vision-Language Adapters
53 views, 1 day ago
We empirically demonstrate that popular CLIP adaptation approaches, such as Adapters, Prompt Learning, and Test-Time Adaptation, substantially degrade the calibration capabilities of the zero-shot baseline in the presence of distributional drift. We identify the increase in logit ranges as the underlying cause of miscalibration of CLIP adaptation methods, contrasting with previous work on calib...
ECCV 2024 Redux: Day 1 - Fast and Photo-realistic Novel View Synthesis from Sparse Images
53 views, 1 day ago
Novel view synthesis generates new perspectives of a scene from a set of 2D images, enabling 3D applications like VR/AR, robotics, and autonomous driving. Current state-of-the-art methods produce high-fidelity results but require a lot of images, while sparse-view approaches often suffer from artifacts or slow inference. In this talk, I will present my research work focused on developing fast a...
Computer Vision Meetup: Deploying ML models on Edge Devices using Qualcomm AI Hub
83 views, 14 days ago
In this talk we address the common challenges faced by developers migrating AI workloads from the cloud to edge devices. Qualcomm aims to democratize AI at the edge, easing the transition to the edge by supporting familiar frameworks and data types. This is where Qualcomm AI Hub comes in. Developers can follow along, gaining knowledge and tools to efficiently deploy optimized models on real de...
Computer Vision Meetup: Human-in-the-loop: Practical Lessons for Building Comprehensive AI Systems
117 views, 14 days ago
AI systems often struggle with data limitations, data distribution shift over time, and a poor user experience. Human-in-the-loop design offers a solution by placing users at the center of AI systems and leveraging human feedback for continuous improvement. In this talk, we’ll dive deeply into a recent project at Merantix Momentum: An interactive tool for automatic rodent behaviour analysis in v...
Computer Vision Meetup: Curating Excellence: Strategies for Optimizing Visual AI Datasets
104 views, 14 days ago
In this talk Harpreet will discuss common challenges plaguing visual AI datasets, their impact on model performance, and share some tips and tricks for curating datasets to make the most of any compute budget or network architecture. Speaker: Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG...
Computer Vision Meetup: PostgreSQL for Innovative Vector Search
64 views, 1 month ago
There are a plethora of datastores that can work with vector embeddings. You are probably already running one that allows for innovative uses of data alongside your embeddings - PostgreSQL! This talk will focus on showing examples of how features already present in the PostgreSQL ecosystem allow you to leverage it for cutting edge use cases. Live demos and lively discussion will be the focus of...
Computer Vision Meetup: Pixels Are All You Need Utilizing 2D Image Representation in Robotics
227 views, 1 month ago
Many vision-based robot control applications (like those in manufacturing) require 3D estimates of task-relevant objects, which can be realized by training a direct 3D object detection model. However, obtaining 3D annotation for a specific application is expensive relative to 2D object representations like segmentation masks or bounding boxes. In this talk, Brent will describe how we achieve mo...
Computer Vision Meetup: Accelerating Machine Learning Research and Development for Autonomy
242 views, 1 month ago
At Oxa (Autonomous Vehicle Software), we designed an automated workflow for building machine vision models at scale from data collection to in-vehicle deployment, involving a number of steps, such as, intelligent route planning to maximise visual diversity; sampling of the sensor data w.r.t. visual and semantic uniqueness; language-driven automated annotation tools and multi-modal search engine...
Computer Vision Meetup: Using Elasticsearch Vector Search in FiftyOne
112 views, 1 month ago
In this short demo, Steve Pousty (Developer Advocate at Voxel51) shows you how to leverage Elastic’s vector search capabilities for computer vision use cases using the FiftyOne open source library. Not a Meetup member? Sign up to attend the next event: voxel51.com/computer-vision-ai-meetups/ Recorded on Oct 10, 2024 at the AI, Machine Learning and Computer Vision Meetup. #computervision ...
Computer Vision Meetup: Elastic is for the Birds: Identifying Embedding Images using Vector Search
100 views, 1 month ago
Computer Vision Meetup: RGB-X Model Development: Exploring Four Channel ML Workflows
111 views, 1 month ago
Computer Vision Meetup: How Renault Leveraged Machine Learning to Scale Electric Vehicle Sales
140 views, 1 month ago
Computer Vision Meetup: GPUs at Scale - Trials of a GPUaaS Provider
84 views, 2 months ago
Visual AI in Healthcare: NVIDIA’s VISTA-3D and MedSAM-2 Medical Imaging Models
481 views, 2 months ago
Visual AI in Healthcare: Exploring Instance Imbalance in Medical Semantic Segmentation
95 views, 2 months ago
Visual AI in Healthcare: Advancing Comparative Computational AI in Veterinary Oncology
138 views, 2 months ago
Visual AI in Healthcare: Interpretable AI Models in Radiology
236 views, 2 months ago
Computer Vision Meetup: It's in the Air Tonight. Sensor Data in RAG
149 views, 2 months ago
Computer Vision Meetup: Data-Centric AI Competition on Hugging Face Spaces
56 views, 2 months ago
Computer Vision Meetup: Reducing Hallucinations in ChatGPT and Similar AI Systems
147 views, 2 months ago
Computer Vision Meetup: Accelerating Multimodal RAG Pipelines with NVIDIA and OSS Integrations
175 views, 2 months ago
Computer Vision Meetup: 5 Handy Ways to Use Embeddings, the Swiss Army Knife of AI
77 views, 2 months ago
Computer Vision Meetup: Agentic RAG in 2024
527 views, 2 months ago
Amazing work!
WOW. This looks so amazing. I can't wait to use this!
Really interesting
Thank you for sharing the video. Does this plugin assume a vector engine like qdrant is used as backend?
Thank you for sharing this video on the Active Learning plugin. Is it possible to use the plugin for multi-class multi-label tasks as well?
When I try the dev install process in a git bash terminal, it fails at a point because of a package error. "Collecting shapely>=1.7.1 (from -r requirements\extras.txt (line 7)) Using cached shapely-2.0.6-cp312-cp312-win_amd64.whl.metadata (7.2 kB) ERROR: Could not find a version that satisfies the requirement open3d>=0.16.0 (from versions: none) ERROR: No matching distribution found for open3d>=0.16.0" How can this be solved?
Very interesting demo; would you mind sharing the Colab link?
ok, and how does one start it?
This was very helpful! Llama Index grows so fast, it feels overwhelming for a beginner.
!second comment
I want to work with my custom dataset. I'd like you to show me how to do it and what benefits I can get from using your product. For example, how can I refine my own data with FiftyOne?
Isn't the "Grid Trick" similar to using ControlNet, a type of model for controlling image diffusion models by conditioning the model with an additional input image?
Love your product!
How do we execute the plugin logic in code? This doesn't seem to work:
    logging.info("removing approximate duplicates")
    operator_uri = "@jacobmarks/image_deduplication/remove_all_approximate_duplicates"
    params = {
        "sim_choices": "sim",  # You may need to adjust this based on your similarity run key
        "threshold_value": 0.4
    }
    # Create an invocation request
    request = foe.InvocationRequest(operator_uri, params=params)
    # Create an executor and execute the request
    executor = foe.Executor(requests=[request])
    result = executor.trigger(operator_uri, params=params)
    print(result.to_json())
    # logging.info(f"Found approximate duplicates: {result.result}")
    return result
The video was great, thanks mate for the explanation.
Wonderful 👍
Great!
Are the search results only online images, or can they be local images?
you can drag and drop a local image in :)
Made that look *way* too easy. I spent a whole hour last night trying to get the first line of code to work! It was because my Python paths were thrown about the place
ahahahaha) me too)
How do I build the JS part of the code to generate the umd.js file in the dist folder? I am building with yarn build, but the generated umd file is not working and does not open the new panel. Please help.
Great question. Try `yarn install` as well. Make sure that the plugin is in your plugins directory. And when you want to change the plugin, make sure you use `yarn dev`. If you have more questions about FiftyOne Plugins, check out the #plugins channel in the FiftyOne community Slack! slack.voxel51.com/
great tutorial, can you use a local instance of SD?
It is great that you give us a list of next steps, but a link to each of these points would have been nice!
nice job!
This is good! But I believe the data should also capture eye movement. Eye movement is crucial for mapping intention and will aid in robot navigation. Apple's headset has the hardware to monitor both eye direction and head direction.
I have an idea for building an autonomous drone that uses computer vision to detect objects that were previously labeled with a GPS location.
Could you please share the slides?
How does it select which images would be kept as "representatives" and which removed?
I want search operators like intitle:"keyword" for more efficient topic search.
Is it possible to use this and find the most similar image given user submitted photos? For example I'm trying to do something to detect trading cards, where the input would be photos of cards submitted by users.
absolutely!
Hello, I downloaded and installed FiftyOne, but I don’t know how to use it. All your videos didn’t explain how to use it.
There is lots of documentation online on their website, check it out! It's really not difficult to get it running, but it's "only" an API, so some Python experience is definitely helpful. :)
I love this
good
Thanks for clear explanation❤
Thank you so much bro. Nice tutorial.
getting Not Found
Thanks, great overview
Looks promising. I was going through your tutorial, and I was hoping to see how you can import your own database.
Can you perform the initial labeling on images that have not been annotated yet? I'm on part 5 and have not seen that information yet. Did I miss it?
Can you edit/correct or add/remove annotations directly in FiftyOne?
I am really excited about this product! Thank you for this hands-on video!
"Wow, this video is incredibly informative and well-produced! The speaker does a fantastic job of explaining the complex topic of speech recognition and the new Whisper model from OpenAI in a way that's easy to understand. Great job, highly recommended to anyone interested in this field!"
How do we add our own dataset to FiftyOne? I want to label my own data.
As mentioned in the video, FiftyOne isn't a classical annotation tool, but it provides hooks to do that with CVAT, Labelbox, etc. and then load the labeled data back into FiftyOne. For me the CVAT solution worked perfectly fine. Everything is well documented on their website, check it out! :) If you want to load annotation data that is in your own format rather than a typical data format (COCO, ...), you'll have to write a few lines of Python code yourself. For that purpose I implemented a DatasetHandler class. You'll have to convert to FiftyOne's format by iterating through your data and turning each annotation into a FiftyOne Detection object: detections.append(fo.Detection(label=my_label, bounding_box=my_bbox)). FiftyOne doesn't work "out of the box", but it's a great tool for working with CV data!
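The conversion described in the comment above might look roughly like the sketch below. It is only an illustration, not the commenter's DatasetHandler class: the input tuple layout `(label, x_min, y_min, x_max, y_max)` in pixels, the helper names `to_relative_bbox` and `build_sample`, and the `ground_truth` field name are all assumptions. It relies on FiftyOne expecting `bounding_box` as relative `[top-left-x, top-left-y, width, height]` values in `[0, 1]`.

```python
from typing import List, Sequence, Tuple

# Hypothetical custom annotation format: (label, x_min, y_min, x_max, y_max) in pixels.
PixelBox = Tuple[str, float, float, float, float]


def to_relative_bbox(
    x_min: float, y_min: float, x_max: float, y_max: float,
    img_w: float, img_h: float,
) -> List[float]:
    """Convert pixel corner coordinates to relative [x, y, width, height]."""
    return [
        x_min / img_w,
        y_min / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    ]


def build_sample(filepath: str, boxes: Sequence[PixelBox], img_w: float, img_h: float):
    """Build one FiftyOne sample whose detections come from the custom boxes."""
    # Imported lazily so the pure-Python helper above works without FiftyOne installed.
    import fiftyone as fo

    detections = [
        fo.Detection(
            label=label,
            bounding_box=to_relative_bbox(x0, y0, x1, y1, img_w, img_h),
        )
        for label, x0, y0, x1, y1 in boxes
    ]
    sample = fo.Sample(filepath=filepath)
    # "ground_truth" is an arbitrary field name chosen for this sketch.
    sample["ground_truth"] = fo.Detections(detections=detections)
    return sample
```

Samples built this way can then be added to a `fo.Dataset` with `add_sample` and inspected in the app via `fo.launch_app(dataset)`.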
Hi, I am getting the following error in Colab and Jupyter Notebook with both custom data and COCO 2017 (default data): MalformedQueryException: Cannot attach/detach dataset to/from a batch project. Kindly help me solve this issue.
Hi guys, how are you? How can I change the font in the FiftyOne interface? I hope to get your reply!
An installation tutorial would be nice