Historical cross-tables archive at CFC site now seems complete back to 1996!

    Just checking the CFC site today, I note that there has been an enormous addition to the historical cross-tables archive file. Back to 1996 now! Thanks and congratulations!!

  • #2
    Yes, indeed, the historical event/ratings data went up a few days ago. Thanks all for your patience as there were some unexpected challenges recovering the events/ratings data. The data was mostly intact, especially in recent years, but amongst the 24,109 tournaments with 355,651 players about 100 or so fixes were needed.

    The data was split over 4 sources: Drupal’s MySQL database (used by the old website) and 3 Microsoft Access tables. Drupal had the most data but was still missing events, missing event data, and/or had incorrect data. After noticing some anomalies, I wrote programs to scan through all the data to find differences. Drupal was missing 47 events around 2000, including the 1999 Pan Am Open. The Pan Am had 139 players in 5 sections including a Frank Dixon who played in the top section. All 47 events have now been recovered and are visible again on the website after a 10+ year hiatus.
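
    For readers curious what such a scan can look like, here is a minimal sketch. It is not the actual program used: it assumes each source has already been loaded into a dictionary keyed by event id, and the function names and the "name" and "players" fields are hypothetical.

```python
def find_missing_events(sources):
    """Return, per source, the event ids that exist elsewhere but not in it.

    sources: dict mapping a source name to a dict of event_id -> event record.
    (Hypothetical layout; the real Drupal/Access schemas are not known here.)
    """
    all_ids = set()
    for events in sources.values():
        all_ids.update(events)
    return {name: sorted(all_ids - set(events)) for name, events in sources.items()}


def find_conflicts(sources, fields=("name", "players")):
    """List (event_id, field, values) where sources disagree on a field."""
    all_ids = set()
    for events in sources.values():
        all_ids.update(events)
    conflicts = []
    for event_id in sorted(all_ids):
        for field in fields:
            # Collect this field's value from every source that has the event.
            values = {
                name: events[event_id].get(field)
                for name, events in sources.items()
                if event_id in events
            }
            if len(set(values.values())) > 1:
                conflicts.append((event_id, field, values))
    return conflicts
```

    Events reported as missing from one source can then be restored from another, while conflicts are left for a human to review.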

    Amongst the other differences, the most common were mismatched player CFC ids. The easiest to resolve were duplicate CFC ids assigned to one person; easy since the duplicates were marked with “zzz” in the players' names. Others took a little sleuthing, such as a couple of cases where the correct CFC id was not immediately obvious. In one such case, one player had played in only BC and Alberta events and the other played in only Ontario events. Since the event in question was held in BC, it was possible to pick the most probable CFC id.
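
    The two heuristics described above can be sketched roughly as follows. This is an illustration under assumptions, not the code actually used: the function names, the ids, and the idea of a per-player list of past event provinces are all hypothetical.

```python
from collections import Counter


def is_marked_duplicate(player_name):
    """True if the name carries the "zzz" marker used to flag duplicate ids."""
    return "zzz" in player_name.lower()


def most_probable_id(candidates, event_province, history):
    """Pick the candidate id whose past events best match the event's province.

    history: dict mapping cfc_id -> list of province codes of past events
    (hypothetical structure). Returns None when no candidate has played in
    that province or when the top two candidates tie, so that the case is
    left for manual review.
    """
    scores = {
        cfc_id: Counter(history.get(cfc_id, []))[event_province]
        for cfc_id in candidates
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

    Returning None rather than guessing keeps the ambiguous cases visible, which matches the "little sleuthing" that some fixes needed.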

    When I started, I didn't know there would only be about 100 fixes, but I enjoyed the puzzle and believe the delay was worthwhile. After a good scrubbing, the ratings data is as clean as it can be, ready for another 25 years.


    • #3
      Originally posted by Don Parakin View Post
      Yes, indeed, the historical event/ratings data went up a few days ago. Thanks all for your patience as there were some unexpected challenges recovering the events/ratings data. The data was mostly intact, especially in recent years, but amongst the 24,109 tournaments with 355,651 players about 100 or so fixes were needed.

      The data was split over 4 sources: Drupal’s MySQL database (used by the old website) and 3 Microsoft Access tables. Drupal had the most data but was still missing events, missing event data, and/or had incorrect data. After noticing some anomalies, I wrote programs to scan through all the data to find differences. Drupal was missing 47 events around 2000, including the 1999 Pan Am Open. The Pan Am had 139 players in 5 sections including a Frank Dixon who played in the top section. All 47 events have now been recovered and are visible again on the website after a 10+ year hiatus.

      Amongst the other differences, the most common were mismatched player CFC ids. The easiest to resolve were duplicate CFC ids assigned to one person; easy since the duplicates were marked with “zzz” in the players' names. Others took a little sleuthing, such as a couple of cases where the correct CFC id was not immediately obvious. In one such case, one player had played in only BC and Alberta events and the other played in only Ontario events. Since the event in question was held in BC, it was possible to pick the most probable CFC id.

      When I started, I didn't know there would only be about 100 fixes, but I enjoyed the puzzle and believe the delay was worthwhile. After a good scrubbing, the ratings data is as clean as it can be, ready for another 25 years.
      Thank you so much Don for all you have done and all you are doing to recover our website from the mess that it was in just a few months ago. You are a miracle worker.


      • #4
        Don Parakin for Prime Minister of Canada!!

        Seriously, I am wondering two things:
        1) I note that the TD spaces for the events listed on the rebuilt site are mostly blank. These were filled in before. Is there a plan to renew that information? As an organizer, I believe this is important.
        2) Is there a plan to go back further in time, before 1996, with cross-tables archive work, in the CFC's plans?


        • #5
          Originally posted by Frank Dixon View Post
          Don Parakin for Prime Minister of Canada!!
          I sincerely and profusely apologize for any horrible thing I did to you to make you wish that upon me ;)

          Originally posted by Frank Dixon View Post
          the TD spaces for the events listed on the rebuilt site are mostly blank.
          Almost there; coming soon. MS Access (which the Ratings program uses) has both organizer & TD (arbiter). Drupal (which the old website used) had just one; probably a merging of the two. The new database now has both and will soon display both on the web pages. It is important to give credit to both the Organizer & TD.

          Originally posted by Frank Dixon View Post
          Is there a plan to go back further in time, before 1996
          No plans at the moment because the pre-1996 data is not available in electronic form and is probably not available in paper form either. Around 1997, Troy Vail computerized much of the Business Office, including the calculation of ratings. That's about when tournament reports started being captured electronically. Bob Gillanders tells me that when they were shutting down the CFC Business Office in Ottawa about 10 years ago, there were cabinets full of index cards (paper) with pre-1996 tournament results and hand calculations of ratings. These were sent to a storage facility, but were probably thrown out later after the rent payments were stopped. :(
          Last edited by Don Parakin; Friday, 12th February, 2021, 06:43 PM.


          • #6
            I recall Michael von Keitz and maybe Hal Bond went to Ottawa at one point to sort out the contents of a locker. This might have been when I was masters rep which would have been 2011-2013. They were probably not equipped to recover cabinets of paper.


            • #7
              To Don and Vlad: thanks for replies to my post!
              1) No offense was intended towards Don; I was merely suggesting, in jest, that with his extraordinary skills on this file, Canada would be better off if he were to be in charge of the country!
              2) Concerning Vlad's reply, in view of the high importance of this particular matter, I will formally submit this question to Vlad, as CFC President, and to the CFC Executive, for further formal investigation, on the record. There may be no way now to retrieve this historical cross-tables information, but if there is, that avenue should certainly be pursued. I think we must know for sure. Since Michael von Keitz and Hal Bond are still active and involved with Canadian chess, I would like to see Vlad contact them now concerning this, and report back. Present and future Canadian chess historians will be grateful. Major Canadian events should have been archived permanently, one would expect.
              Respectfully,
              Frank Dixon
              NTD, Kingston


              • #8
                I can't help with the more recent stuff, but I was told many years ago that in 1996 when the CFC had gone to their first online computer system (the first one that used MS Access I believe) they had tossed tons of old records, some of it so old that it was on 8" floppy disks. I was rather horrified, as a digital pack-rat - it would have been relatively easy to condense those onto some CDs or something.
                Christopher Mallon
                FIDE Arbiter


                • #9
                  Originally posted by Christopher Mallon View Post
                  I can't help with the more recent stuff, but I was told many years ago that in 1996 when the CFC had gone to their first online computer system (the first one that used MS Access I believe) they had tossed tons of old records, some of it so old that it was on 8" floppy disks. I was rather horrified, as a digital pack-rat - it would have been relatively easy to condense those onto some CDs or something.
                  It's never too late to throw stuff out. Just like I tell my students, it's never too late to resign. I don't have any eight-inch drives, but I do have lots of old computer parts that could be resurrected if needed.

                  I don't think we can assign a high priority to recovering this data now. The best bet for historically significant tournaments would be through old CFC bulletins but our first priority has to be in making it through these trying times. On a CFC forum post, I mention Warren Buffett's 25/5 rule. Make a list of your 25 top goals and priorities in ranked order. Draw a line under number 5 and never work on number six through 25. I am not even sure that this would make my top 25 though I am certain that it does not make the top 5 priorities.

                  We can stipulate to the fact that mistakes were made in the past. Not sure that it is possible, nor very helpful to dwell on those mistakes. Certainly unless there is a cache of disks somewhere, we are out of luck on this. When we have a turnover of members of between 25% and 35% yearly, going back 25 years with our data seems quite good.


                  • #10
                    Originally posted by Vlad Drkulec View Post

                    It's never too late to throw stuff out. Just like I tell my students, it's never too late to resign. I don't have any eight-inch drives, but I do have lots of old computer parts that could be resurrected if needed.

                    I don't think we can assign a high priority to recovering this data now. The best bet for historically significant tournaments would be through old CFC bulletins but our first priority has to be in making it through these trying times. On a CFC forum post, I mention Warren Buffett's 25/5 rule. Make a list of your 25 top goals and priorities in ranked order. Draw a line under number 5 and never work on number six through 25. I am not even sure that this would make my top 25 though I am certain that it does not make the top 5 priorities.

                    We can stipulate to the fact that mistakes were made in the past. Not sure that it is possible, nor very helpful to dwell on those mistakes. Certainly unless there is a cache of disks somewhere, we are out of luck on this. When we have a turnover of members of between 25% and 35% yearly, going back 25 years with our data seems quite good.
                    I'm not sure why you quoted my post when yours is mainly a response to Frank Dixon, but are you seriously saying that - because of COVID or because Buffett says so - you can't be bothered to send a simple email to two people asking if there were any paper cards or other data in the storage unit and confirming that they were likely disposed of?

                    That's all he asked you to do, he didn't ask you to spend hundreds of hours going over old data. There are these handy people called volunteers - several people in this thread I'm sure are interested enough to go over that data if it still existed.
                    Christopher Mallon
                    FIDE Arbiter


                    • #11
                      Originally posted by Christopher Mallon View Post

                      I'm not sure why you quoted my post when yours is mainly a response to Frank Dixon, but are you seriously saying that - because of COVID or because Buffett says so - you can't be bothered to send a simple email to two people asking if there were any paper cards or other data in the storage unit and confirming that they were likely disposed of?

                      That's all he asked you to do, he didn't ask you to spend hundreds of hours going over old data. There are these handy people called volunteers - several people in this thread I'm sure are interested enough to go over that data if it still existed.
                      I will send an email to Michael von Keitz, but likely I can just go through my old emails from Michael from right around the time of the trip to Quebec City in 2012 or 2013.

                      Hal Bond has sent me an email telling me and presumably the other members of the executive not to email him. To continue to email him after he has sent such an email would not be prudent on my part. So no, I am not going to send that email to Hal. Until such time as he rescinds that email, to send any more would open the door to legal consequences for myself, for my email account and for the CFC, and might lead to a restraining order.

                      Your post was what I was responding to, and at the same time I was responding to what you were responding to. My first computer was an Amiga 1000, which I still have and occasionally play video games on (last time perhaps five years ago). I have five-and-a-quarter-inch and three-and-a-half-inch drives that could be placed inside a computer enclosure and used to retrieve data if we had data. I have a three-and-a-half-inch drive which works over USB but haven't had to use it in some years either. I rarely use DVDs anymore except to watch movies.


                      • #12
                        Originally posted by Vlad Drkulec View Post
                        ...
                        Hal Bond has sent me an email telling me and presumably the other members of the executive not to email him. To continue to email him after he has sent such an email would not be prudent on my part. So no, I am not going to send that email to Hal. Until such time as he rescinds that email, to send any more would open the door to legal consequences for myself, for my email account and for the CFC, and might lead to a restraining order.
                        Now one can only wonder more about the circumstances that prompted Hal's resignation as FIDE rep ... or is this just a case of "I already get too much email".

                        I think the speculation about legal action indicates there is a fairly high level of friction.

                        Sad after such dedicated and illustrious service by Hal.
                        ...Mike Pence: the Lord of the fly.


                        • #13
                          Originally posted by Kerry Liles View Post

                          Now one can only wonder more about the circumstances that prompted Hal's resignation as FIDE rep ... or is this just a case of "I already get too much email".

                          I think the speculation about legal action indicates there is a fairly high level of friction.

                          Sad after such dedicated and illustrious service by Hal.
                          When you get to the Julius Caesar Ides of March level of the last election, friction is an understatement. The executive were going to make a decision which he did not agree with. That led to his resignation.
                          Last edited by Vlad Drkulec; Monday, 15th February, 2021, 11:31 AM.


                          • #14
                            I would like to push back on the notion that when someone on ChessTalk has a "great idea" that involves a fair amount of work, the CFC president has to drop everything and get right to work on that idea. That is not how things work. Like everyone else, I also have to make a living and have responsibilities which take a significant bite out of my free time. CFC matters also take a great bite out of my time. In the last two months I have had at least ten or twelve media contacts, which for the most part have taken an hour or so of follow-up on my part. Some have taken significantly more.

                            I will not rehash the philosophy of themaswot but will rather summarize Vernon Howard's idea in one sentence:

                            Themaswot have the idea that involves lots of work for someone else have to do the work if they want to have the idea implemented.

                            There is no shortage of ideas. There is a shortage of volunteers to implement them.

                            I have not always been aware of this idea and its importance in sorting out what to work on. Not being aware has sometimes worked out okay such as when someone came to me with the idea of organizing CYCC and NAYCC because no one else wanted to do it. That worked out but not without heroic effort that probably came close to killing me.


                            I learned my lesson from the online server debacle. Everyone wants an online server if someone else will do all the work. No one would volunteer to be the admin of that server once we had negotiated an agreement to get access to it. That negotiation took a lot of time. The people that wanted an online server disappeared when the work needed to be done. I didn't have time to be the administrator, and at the time I had largely given up on online chess. I have since gotten back into it, but not so much that I would be willing to add additional volunteer hours to a somewhat peripheral activity which would not make a big difference to the CFC.

                            There is no part of this old data task which requires my involvement except perhaps at the very end.

                            Don Parakin and Bob Gillanders are smart people. If the data were available, they would have found a way to get it. It is apparently not available. If it was available on index cards there would still be a great deal of work required to move it to an electronic format which could then be somehow integrated with our current system. There are fifty things ahead of this on the priority list with some of the things being problems that represent existential threats to the CFC.

                            When Michael von Keitz and Hal Bond went to clear the locker, they took a small trailer and were looking for things of obvious value and importance. I doubt that old filing cabinets with fifteen-year-old paper records of rating calculations made the cut. The fact that questions about these are only coming up now, eight or nine years later, suggests that they probably made the right decisions, as painful as that might be to the chess historians now. We have to look forward and not back.
                            Last edited by Vlad Drkulec; Monday, 15th February, 2021, 12:15 PM.


                            • #15
                              Originally posted by Don Parakin View Post
                              No plans at the moment because the pre-1996 data is not available in electronic form and is probably not available in paper form either.
                              Michael von Keitz will get in touch with you once COVID retreats. He has something in electronic format that could be valuable for the historical records. You can contact him directly to find out sooner :)


                              Personally, I looked at archive.org. It has snapshots of the CFC website going back to 1996-7, probably when the CFC first went online. It seems, though, that those old crosstables are already in the new database. All I could contribute is the effort to create readable files from the Toronto Closed/Reserves crosstables that the GTCL has on its website. Would they be incorporated?
