Adding "sort by columns" to Reports

Publicat
2010-10-25 14:04
Written by

This has come up a couple of times before. According to my chat with Lobo today, not much is happening on this front. Since our client at Two Mice and a Strawberry has a need for this, we're diving in head first this week. Hopefully we'll have something that's worth contributing to core. Please comment below if you have any thoughts on the ideas presented here.

Our intent is to take the order_bys parameter that's mentioned passingly in the docs and build on it as follows.

 

The general idea

  • any column available for display is also available for sorting
  • additionally, columns defined in order_bys are also available for sorting
  • add as many sort columns as you wish (Correction: as many as you want, as long as you don't want more than 10), ascending or descending for each
  • any sort column can be marked as "create section header", in which case section headers with total counts are inserted at the top of each section

 

Mockups

A couple of quick mockups show how this might look to the user:

1. Report criteria form showing "Order by Columns" section:

Report criteria mockup

2. Report body showing output based on above critera:

Report criteria mockup

Potential challenges

  • It will be hard to get the section headers to play nicely with header-click table sorting; from a logical viewpoint it can be nonsensical, and from a technical viewpoint it can be pretty complex.  We're considering either disabling header-click sorting entirely when section headers are used, or else removing them immediately when the table sort is changed by a header click.
  • We anticipate some headaches related to sorting, but at the moment this is more a fear of the unknown than any specific concern.

 

Conclusion

 As I said, any feedback from the community would be really great.  Since our client needs this pretty quickly, we'll probably come up with a solution that works for our immediate needs and then spend time afterward getting it generalized properly for core.  I'm looking forward to your comments.

Filed under

Comments

By sortable, you mean grouped by ?

 

eg when you "sort by" Events, mean you generate a "sub chapters" for each event ?

 

Sorry, confused, what's the difference between what you suggest and the grouped by ?

 

As for the sorted per se, wouldn't it be easier to generate (already sorted if you think perf is an issue) and add as a param to the table plugin to sort by this column (eg so you can still use it to sort by another column if you want to);

 

X+

The "Group by" settings create an SQL "GROUP BY" clause, which basically says: if there is more than one row with the same value in column X, only show one of those rows.

 

What we're doing here is focused on the SQL "ORDER BY" clause, allowing the user to order by one or more columns, with any mix of ASC/DESC keyword.

 

We're also looking to allow inserting section headers into the table output when the value of a given column changes from one row to the next.  In long reports, it can be helpful to break things into sections.  For example, if you're sorting a list of people by country, gender, and last_name, it can be very helpful to have section headers for each country, and within each country, section headers for each gender.  This can be preferable to having 80 rows that have "Albania" in the first column and "M" in the second column, followed by 70 rows with "Albania" in the first column and "F" in the second column.  Because sorting -- and thus the optional section headers -- can be defined for columns that don't display, the user has the choice whether to display the columns for country and gender, or instead to save a little space and just show that info in the section headers (see the mockup: the event column is not displayed, but the event is named in the section header).

 

Since sections are tied to sorting, you can pretty easily calculate the switch from one value to the next, and insert a header in that spot.  And since the section header is optional, you can insert section headers only for the fields you want, or not at all.

 

Another nice thing about sections is the ability to show, in the section header, the total number of rows contained in that section.  So in the country/gender/last_name example, we would see "150" showing as the total row count for Albania in its section header, and 80 and 70 showing in the section headers for M and F, respectively.  This can can be pretty useful when dealing with long reports.

A big thanks for stepping up to the plate on this.

There are two places where the order of the fields in the sort get defined: in code, and in the browser.

I'm a bit confused by the order_by versus sort by - aren't they the same? I think you likely meant group_bys and order_bys.

If you try to let any order by field take sorting precedence over group by fields you'll run into problems: you should take that precedence as a design constraint both in code and in the UI.

I'm also a bit confused as to what is meant by the check box at the right of order by fields versus the functionality provided by group by, eg event field. Is it that you want the checkbox to give a heading, while the group by currently is designed to provide a tally whenever the value changes, if I'm not mistaken? My sense is that you might be better off adding options to group by's for things like making it a header rather than a column, rather than making it an order by option.

From a UI perspective, I think that there are two or three ways to go to specify the order of columns for sorting, and then whether ascending or descending:

  • 1. Use a popup interface similar to the Excel one which lets you specify the order of the fields used in the sort, and what type of sort (eg ascending or descending).
  • 2. Use a drag and drop interface to put the columns in sort order of precedence from left to right, as well as making them clickable for sorting (phpMyAdmin might have some code you can repurpose if you go this way).
  • 3. Use a drag and drop, or perhaps ordered include / exclude widget (like the one that adds and removes components or enabled countries), to set the sort order in an area of the screen other than the columns. I'm not clear on how you would deal with ascending / descending elegantly on this approach.
  • I think that 1 is clearest, and is familiar to users. 2 is not as good functionally since you sometimes want things ordered by certain columns that should be pushed way to the right for display because you don't care about their content except to help in the ordering. 3 is a bit of 1 - choosing sort order separately from column order - and a bit of 2 - doing things on the form with a kind of drag and drop rather than in a pop-up. Might be better, might be worse, depending on implementation. 1 seems pretty quick to implement.

HTH,

Joe

"any column available for display is also available for sorting"

 

It would be good to limit the ability of a report writer to take down the database.  You could limit it to only columns that have an index.  That's a list of known columns, plus custom fields that are "searchable". 

 

"as many as you want, as long as you don't want more than 10"

 

Again, for damage control you might want to limit that to 3-5. 

Thanks dalin.  These are good points.  3-5 columns is probably more than enough in most cases.  It might be wiser to limit the "order by" options only to those specified in order_bys, and let the report developer decide what goes in there, without insisting that all display columns be included.  Letting the code check for indexes and insist on sorting only by indexed columns would be safer, but I would rather not insist on it and let the report developer decide.

I think we need to be careful with SQL terminology vs end-user terms.

 

With SQL "group by", the detail rows are not shown in the result set. What most end-users want is the detail rows, with what I am going to call "report breaks" . To me what many end-users ask for is "show me all the records for x, with a visual break whenever a  new category is reached. 

 

In the screen print above, the light blue row with the text "Event: Fall Fundraiser Dinner(12)" is what I would describe as a "report break" followed by 12 detail records.

 

When people use "report breaks" in their reports, allowing the end-user to re-sort the table by clicking a column heading will produce nonsense in most/many cases.

 

I like the idea of having a report parameter called "order by" or something similar to allow the end-user to do what they want, without the danger of producing nonsense. 

Sarah's comment (above yours) explains it a little better, perhaps.  The sum is only displayed within the section header -- or what Sarah is calling a "report break."  And this section header is only displayed if the "create section header" check-box is checked.  So defining an "order by" column won't always "display sum."

In a previous reporting solution I was involved with, we boiled down the options for the end-user as follows:

------

Do you want report breaks? yes/no

--------

If yes is chosen, ask which column values should breaks be tied to. If there are 15 fields in the query, present the user with a check box for each field. ( In an event income report, the end-user may want a report break every time there is a new ContributionType, and a new month. So within the section for "event fee A" there would be a sub-section for each month. Then for "event fee B", there would be a sub-section for each month, and so on)

 

If the end-user chooses "yes" to a report break for any field, then ask them what they want to display at the report break. It could just be a visual spacer, or just the field value that caused the report break, or it could include some subtotals, averages, counts, etc. 

 

 In the case of that other reporting tool, if the end-user wanted counts, subtotals, etc within a report break, the code that looped and created the array to send to the browser, also did the report break calculations for subtotals, counts, etc.  The SQL code did not calculate those report break numbers.  ( Since we had to loop through all the detail records anyhow, it was not any extra work to count records, or get a subtotal. ) 

That's pretty much what we're doing here, except that in this design (as it stands now) the options are not so varied as to what a section break does, only whether to have a section break or not, and then section breaks include, by design, the field label, field value, and section total.

 

Would be nice in the future to allow more configurability in the section headers.  Even at this point, we're considering making the "section total" count optional, to improve performance when that number isn't really needed.

This may not be the appropriate place for this comment, so feel free to shift it. 

 

As a companion to improving the sorting behaviour of reports, I would like to see the filtering options for reports improved.  It would be really useful to be able to filter by any "Type " column that is displayed in the report.  By "type" column I really mean any column that contains predefined options eg. Event Type, member type, participant status (for events), Payment instrument, contribution type etc.

Hi Does anyone know if this has been developed or what stage it is at. This would be a huge benifit and we might be able to contribute depending of price and speed of dev.
Thanks

this is not part of the 3.4/4.0 series

If you can help "make it happen" (similar to http://civicrm.org/mih) and/or be the lead sponsor that would be great. To add it in a generalized manner and extend 5-10 reports is expected to take 25-40 hours. We suspect we can do this in a future release since it does not involve changing the schema

ping us on irc / email / forums if u'd like to take this forward

I'm a system administrator and it would be wonderful if we could have a sort by column feature to set as a standard on certain reports. As an example, we have an event scheduled and I would like to create an Attendee List report based on registration date (most recent at the top) and make it available as a dashlet. The only way this report exists today (that I can determine) is alphabetic by participant name. When I try to click on the column title "Registration Date" it does not change the sort order.

I can see numerous ways this would be helpful for us. Any sense of whether this is complete or on a list would be appreciated.