Howdy there partner! Fancy meeting you here!
If you're here, you might be wondering things like: Why does this project exist? What is this? Why is this even worth my precious time to read? Great questions! (Unless you didn't ask them, in which case I can assume you are either A.) a way better developer than me, in which case thank you for gracing me with your presence, or B.) you're a filthy bot.) Let me answer those for you!
Why does this project exist? I wanted to force myself to create a project that would span multiple fields, force me to learn new things, and, God willing, help me find a new job. I wanted to create a project that could grow into something truly useful, something hard, and that would cover areas I have either not done before or only done in isolation. This project is intended to be the Mona Lisa of my 20s, hitting: Geolocation Services, Cloud Computing (improving my ability to use Azure), Machine Learning, Data Science and working with big data, Full Stack Development, and refining code that I've already written to reduce load times and cost.
Worst-case scenario, I pay Microsoft some money and learn something new. Best case, this project leads to many more interesting and challenging projects.
What is this project?
Project Conwy is an interactive real estate webapp, with the goal of being the ultimate hypothetical real estate site, providing priceless insights and direction for a fake real estate company, name TBA (probably some stupid name). I intend to host it on Azure when I am done with the project (at least temporarily), and the source code will be on GitHub (minus my secrets, go get your own) for anyone to use. Nothing in this project is revolutionary to my knowledge, but I do hope I have put something unique together here.
There are some rules to this project to make it a stronger learning project than just googling the problem and pasting solutions ad infinitum:
- Rule 1: No copying code from any source.
- Rule 2: ChatGPT can be used to provide hints to problems, but no solutions / code.
- Rule 3: Features must be planned out, written out, and pseudocoded out.
- Rule 4: Notes must be taken in Obsidian for future reference on problems that I found hard.
These rules have not only helped me get back into the mindset of working on a large problem, but have also ensured that each part of the project has stuck much better than in many of my other projects. My approach to documentation in Obsidian, for example, meant that when I had a health scare in August (thankfully it was a benign tumor) and had to pause many things, returning to the project several weeks later felt like I had never stepped away, and the madness of my methods was, and still is, easy to manage and expand upon.
Features I have done:
- Ability to switch map types on a single page
- Historical Kentucky County Averages Map
- State Averages Entire USA
- Real estate interactive map
- Use Geolocation to map houses from an Azure SQL Database and Blob Storage
- Big Data Pipeline
- Cleaned up a 2.2+ million data point set for real estate in the USA
- Converted it to SQL, uploaded it to Azure, indexed the data, developed a REST API in ASP.NET, and built maps using the data and a GeoJSON file sourced from the US Census and hosted on Eric Celeste's personal website (massive legend, thank you so much for the resource; give him a visit: https://eric.clst.org/tech/usgeojson/)
- Got the data matched up on my map. Cities vs. counties was a beast to tackle. I tried a few methods to fix that: fuzzy matching to another table, point in geometry, and finally a good old lookup table.
- Optimized for Azure reducing my monthly bill by over 50%.
- Microsoft Auth Login
- Secrets Handled via Azure Gatehouse
- Python Property Prediction Model
- Model estimates a house value based on location, size, acreage, bedroom count, bathroom count within the 50 states + Puerto Rico
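For anyone curious what the "good old lookup table" idea from the cities vs. counties bullet can look like, here is a minimal sketch in Python. It is not the project's actual code: the table contents, the made-up rectangular county outline, and every name in it (`CITY_TO_COUNTY`, `COUNTY_RINGS`, `county_for`, and so on) are assumptions for illustration. Using point in geometry as a fallback when the table has no entry is also my assumption, not something the project necessarily does.

```python
from typing import Optional

# Hypothetical lookup table: (normalized city, state code) -> county name.
CITY_TO_COUNTY = {
    ("louisville", "KY"): "Jefferson",
    ("lexington", "KY"): "Fayette",
}

# Hypothetical county outline as (longitude, latitude) vertices. This is a
# made-up rectangle, not a real boundary; a real ring would come from GeoJSON.
COUNTY_RINGS = {
    ("Franklin", "KY"): [(-85.0, 38.1), (-84.7, 38.1), (-84.7, 38.4), (-85.0, 38.4)],
}


def normalize_city(name: str) -> str:
    """Lowercase and collapse whitespace so ' Louisville ' matches 'louisville'."""
    return " ".join(name.lower().split())


def point_in_ring(lon: float, lat: float, ring: list) -> bool:
    """Ray casting: flip 'inside' each time a ray heading east crosses an edge."""
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def county_for(city: str, state: str,
               lon: Optional[float] = None,
               lat: Optional[float] = None) -> Optional[str]:
    """Try the lookup table first, then fall back to point in geometry."""
    state = state.upper()
    county = CITY_TO_COUNTY.get((normalize_city(city), state))
    if county is not None:
        return county
    if lon is not None and lat is not None:
        for (name, ring_state), ring in COUNTY_RINGS.items():
            if ring_state == state and point_in_ring(lon, lat, ring):
                return name
    return None  # No match; the caller decides what to do with the listing.
```

The appeal of the table is that a hit is a single dictionary read, so the slower geometry check only runs for cities the table does not know about.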
Planned features this autumn:
- More map modes
- Historical overlays, heatmaps, and a few others to be revealed in future updates.
- More ML models using both data sets (historical plus my "current listings")
- JS charts for visualization of numbers to help tell better stories about historical and current data to better inform the user / company about where the better deals may be.
- Unique pages for each house
- This might be done by the time this is posted; if not, a day or two after at most
- Fully fleshed-out dashboard as the landing page
- Page dedicated to ML models
- Time permitting maybe a resurrected and chatty Clippy to assert your dream house is totally in your budget?
Why do these features matter? I think each feature I plan to add has two critical points to it. Firstly, I think I can learn a lot about web dev and real-world, large-scale data science from incorporating them. Secondly, each feature has multiple real-world use cases, so I could potentially repurpose it, rebuild it faster for future projects, or harness its lessons to make better decisions in the future. Taking these things and finding how to put them together in useful and efficient ways has been a great way to sharpen my CS skills in new and exciting ways, and I look forward to detailing those parts in my upcoming articles. Hopefully someone can find something useful in these one day, even if it's just my future self.