It looks like this might be our first major technical task!
Here's the problem:
- In 2017, the Secretary of State's office started using a brand-new MySQL database for gathering lobbyist data, with a new schema
- There is an entirely separate MS SQL database that contains all the lobbyist data from the mid-2000s through 2016
- It would be really great if we could migrate the data from the old MS SQL database and add it to the new MySQL database
Here are some details:
- To the best of our knowledge, the new database schema is backwards compatible with the old schema (e.g. we think they added new fields in 2017 but didn't change any existing ones)
- The old DB is likely to have malformed data. Almost all fields are unvalidated strings. Any import process will need to look out for bad data and have a process to correct it.
- I haven't been able to get access to actual data yet, but the old DB is roughly 3-4 GB in size
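
To make the bad-data point a bit more concrete, here's a rough sketch of how an import step could separate usable rows from ones that need manual correction. Everything specific in it is an assumption: we don't have the schema yet, so the column names (`lobbyist_name`, `registration_date`), the date formats, and the helper names are all hypothetical placeholders.

```python
from datetime import datetime

# Hypothetical column names and formats; the real schema is not known yet.
REQUIRED_FIELDS = ("lobbyist_name", "registration_date")
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(raw):
    """Try each assumed format; return an ISO date string, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_row(row):
    """Return (cleaned_row, problems); problems is empty when the row is usable."""
    # Old fields are unvalidated strings, so normalize whitespace and NULLs first.
    cleaned = {key: (value or "").strip() for key, value in row.items()}
    problems = []
    for field in REQUIRED_FIELDS:
        if not cleaned.get(field):
            problems.append(f"missing {field}")
    raw_date = cleaned.get("registration_date")
    if raw_date:
        iso = parse_date(raw_date)
        if iso is None:
            problems.append(f"unparseable registration_date: {raw_date!r}")
        else:
            cleaned["registration_date"] = iso
    return cleaned, problems


def partition_rows(rows):
    """Split rows into ones ready to import and ones needing manual review."""
    accepted, rejected = [], []
    for row in rows:
        cleaned, problems = clean_row(row)
        if problems:
            rejected.append((row, problems))  # keep the original row for review
        else:
            accepted.append(cleaned)
    return accepted, rejected
```

The idea is that rejected rows keep their original values alongside a list of problems, so someone can fix them by hand and re-run the import rather than losing data silently.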
Any initial thoughts? It's worth noting that this is essentially a one-time task – once we successfully convert the pre-2017 data into the 2017 format, we'll never need to work with the old data again.
I'll let y'all know as soon as we get access to actual backups or schema information.