Practice Activity - remove duplicate rows in SQL Server (three different ways)
Вставка
- Опубліковано 17 вер 2024
- Duplicate rows are very annoying. But how can you remove them in SQL Server?
My SQL Server Udemy courses are:
70-461, 70-761 Querying Microsoft SQL Server with T-SQL: rebrand.ly/que...
98-364: Database Fundamentals (Microsoft SQL Server): rebrand.ly/dat...
70-462 SQL Server Database Administration (DBA): rebrand.ly/sql...
Microsoft SQL Server Reporting Services (SSRS): rebrand.ly/sql...
SQL Server Integration Services (SSIS): rebrand.ly/sql...
SQL Server Analysis Services (SSAS): rebrand.ly/sql...
Microsoft Power Pivot (Excel) and SSAS (Tabular DAX model): rebrand.ly/mic...
----
Duplicate rows may lead to erroneous conclusions, so often you will want them to be deleted. But how can you identify them?
In this video, we'll have a look at three different ways at how we can get rid of them, using DISTINCT, GROUP BY, and UNION - but which is better?
The starting code is:
DROP VIEW IF EXISTS WithDuplicates
GO
CREATE VIEW WithDuplicates AS
SELECT *
FROM sys.objects
UNION ALL
SELECT *
FROM sys.objects
SELECT object_id, [name]
FROM WithDuplicates
ORDER BY object_id
The code that I used in this video is:
SELECT DISTINCT object_id, [name]
FROM WithDuplicates
ORDER BY object_id
SELECT object_id, [name]
FROM WithDuplicates
GROUP BY object_id, [name]
ORDER BY object_id
SELECT object_id, [name]
FROM WithDuplicates
UNION
SELECT object_id, [name]
FROM WithDuplicates
ORDER BY object_id
This video is great, thank you for your effort! However, you’re showing us a way to filter the duplicates, not remove them. Maybe you could make another video on how to remove duplicate rows from the database?
Hi vas. Thank you for the suggestion. I'll do a related video, marking them as to be deleted. It's then down to you to delete them if you want. Phillip
Super practice activity Philip as always! Its worth mentioning that numbers (results) may be different depending on what database we will create our view because sys.objects can have different numbers of objects inside. Here is my query before I watch this video, some of them are super silly but the result is ok I mean without duplicates :-
- to see duplicates
select count(object_id)as [Nr]
,object_id
FROM WithDuplicates
group by object_id
having count(object_id) >1
order by Nr desc
-- to exclude duplicates
SELECT
distinct(object_id) , [name]
FROM WithDuplicates
ORDER BY object_id
-- to exclude duplicates
SELECT object_id, [name]
FROM WithDuplicates
intersect
SELECT object_id, [name]
FROM WithDuplicates
ORDER BY object_id ;
-- to exclude duplicates
WITH CTE_duplicates as (
select object_id, [name]
, ROW_NUMBER () over (partition by object_id order by name) as [Dup]
FROM WithDuplicates
) select object_id, [name] from CTE_duplicates where dup = 1
-- to exclude duplicates
select object_id, [name]
,GROUPING_ID (object_id, [name])
FROM WithDuplicates
group by object_id, [name]
ORDER BY object_id
-- to exclude duplicates
drop table if exists #temp_result
select object_id, [name]
into #temp_result
FROM WithDuplicates
group by rollup( object_id, [name])
ORDER BY object_id
select * from #temp_result
where name is not null
;
-- to exclude duplicates
drop table if exists #temp2 ;
with cte_X as (
select object_id, [name]
,lead (object_id , 2) over (order by [name] desc) as [lead]
FROM WithDuplicates
) select distinct(lead) ,object_id, [name]
into #temp2
from cte_X
select object_id, [name] from #temp2
ORDER BY object_id
Hi Jacekk7819. Wow - that's an overall lots of ways. Well done. The key is to have all of these ways in your mind, and then when another problem happens, you have all of these tools at your disposal to solve it. That's great. Phillip
I know this an old video, but i love how you go into the why? Could you possibly make more videos regarding this?
Hello Beyala. What “why” are you referring to? Phillip
Great Video!
Can we get RAW DATA used in video....
It will be very beneficial if u can share it with us.
Your explanation is hard to follow. For a learner, it's impossible to keep track of what you're saying.
Hi User. It's a tricky subject, and may be at a more advanced level than you currently are. However, I would urge you, if you want to get to have a higher level, to get some training, whether it is by video course or book form. Keep learning! Phillip
@SQLServer101 Thank you for your feedback. What book would you recommend?
Hi User. I would have a look at "Querying Microsoft SQL Server 2012 Exam 70-461 Training Kit". It's very cheap on the Kindle version. Phillip