Get Max Count from Group by
I have trouble to get a output for group function in sql.Below are the details for table
I have 1 table table name "checks" have 2 columns pid,cid
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PID                                                VARCHAR2(20)
 CID                                                VARCHAR2(20)
Below are rows available
select * from checks;
PID                  CID
-------------------- --------------------
p1                   c1
p1                   c1
p1                   c2
p1                   c2
p1                   c2
p2                   c1
p2                   c1
p2                   c1
p2                   c1
p2                   c1
p2                   c1
p2                   c2
p2                   c2
p2                   c2
p2                   c2
p2                   c2
P represents participants and c represents category
question
I need to know which participant participate more than one category in that which category participant participate maximum.(for every participants)
Expected result:
pid   cid    count(cid)
---   ---    -----------
p1    c2         3
p2    c1         6
Assuming a database system (you haven't specified one, but I suspect Oracle?) that supports windowing functions and CTEs, I'd write:
;With Groups as (
    select pid,cid,COUNT(*) as cnt from checks group by pid,cid
), Ordered as (
    select pid,cid,cnt,
       ROW_NUMBER() OVER (PARTITION BY pid ORDER BY cnt desc) as rn,
       COUNT(*) OVER (PARTITION BY pid) as multi
    from Groups
)
select pid,cid,cnt
from Ordered
where rn = 1 and multi > 1
 The first CTE ( Groups ) just finds the counts for each unique cid,pid combination.  The second CTE ( Ordered ) assigns row numbers to these results based on the count - with the highest count assigned a row number of 1. We also count how many total rows have been produced for each pid .  
 Finally, we select those rows which were assigned a row number of 1 (the highest count) and for which we obtained multiple results for the same pid .  
Here's an Oracle fiddle to play with. And here's an SQL Server version (and thanks to Andriy M for producing the Oracle one)
This will give you some basic ideas:
And the results shown below. Also since p1 participated in more than one Category so p1 will come with each new category in a different row when we use: 'group by PID,CID'
Step by step:
 First, get the counts of rows per (PID, CID) .  This is simple:  
SELECT
  PID,
  CID,
  COUNT(*) AS cnt
FROM checks
GROUP BY
  PID,
  CID
And you get this result set for your example:
PID  CID  cnt
---  ---  ---
p1   c1   2
p1   c2   3
p2   c1   6
p2   c2   5
 Now, throw in COUNT(*) OVER (PARTITION BY PID) to return the number of categories per person:  
SELECT
  PID,
  CID,
  COUNT(*) AS cnt,
  COUNT(*) OVER (PARTITION BY PID) AS cat_cnt
FROM checks
GROUP BY
  PID,
  CID
 The OVER clause turns a "normal" aggregate function COUNT() into a window aggregate function.  That makes the COUNT(*) operate on the grouped row set rather than the source one.  So, COUNT(*) OVER ... in this case counts rows per PID , which for us has the meaning of category counts per person.  And this is the updated result set:  
PID  CID  cnt  cnt_cat
---  ---  ---  -------
p1   c1   2    2
p1   c2   3    2
p2   c1   6    2
p2   c2   5    2
 One more thing left is to rank the cnt values per PID .  This may be tricky as there may be ties at the top counts.  If you always want a single row per PID and are perfectly indifferent to which CID, cnt will be in case of a tie, you can modify the query like this:  
SELECT
  PID,
  CID,
  COUNT(*) AS cnt,
  COUNT(*) OVER (PARTITION BY PID) AS cat_cnt,
  ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn
FROM checks
GROUP BY
  PID,
  CID
And this is what the result set will look like:
PID  CID  cnt  cnt_cat  rn
---  ---  ---  -------  --
p1   c1   2    2        2
p1   c2   3    2        1
p2   c1   6    2        1
p2   c2   5    2        2
 At this point, the results contain all the data necessary to produce the final output, you just need to filter on cnt_cat and rn .  However, you cannot do that directly.  Instead, use the last query as a derived table, be it a WITH table expression or a "normal" subselect.  Below is an example using WITH :  
WITH grouped AS (
  SELECT
    PID,
    CID,
    COUNT(*) AS cnt,
    COUNT(*) OVER (PARTITION BY PID) AS cat_cnt,
    ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn
  FROM checks
  GROUP BY
    PID,
    CID
)
SELECT PID, CID, cnt
FROM grouped
WHERE cat_cnt > 1
  AND rn = 1
;Here's a SQL Fiddle demo (using Oracle): http://sqlfiddle.com/#!4/cd62d/8
 To expand a bit more on the ranking part, if you still want to return a single CID, cnt per PID but would prefer to have more control on what row should be decided as the "winner", you'll need to add a tie-breaker to the ORDER BY clause of the ranking function.  As an example, you could modify the original expression,  
ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rnwith this one:
ROW_NUMBER() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC, CID) AS rn Ie the tie-breaker is CID , and so of the two or more CID s with the top count, the one that sorts before the others wins.  
 Still, you may want to decide to return all the top counts per PID .  In that case, use either RANK() or DENSE_RANK() instead of ROW_NUMBER() (and no tie-breaker), ie like this:  
RANK() OVER (PARTITION BY PID ORDER BY COUNT(*) DESC) AS rn上一篇: 与Heroku的Django部署
下一篇: 从组中获取最大计数
