On scaling Stockopedia

Friday, Nov 22 2013 by Edward Croft

Stockopedia has grown from a hobby project on a single web server to a full-blown web application with over 10 servers, each running different components of our tech stack. We've had to grow our systems to meet the demands of the thousands of subscribers who are now using the product. We are ambitious in our plans and try to be as forward-thinking in our architecture as possible, but this is one huge learning curve for all of us, and very occasionally something jumps out and bites us when we least expect it.

This morning we suffered a nasty four-hour downtime period, for which I hugely apologise. We had had a few issues earlier this week, but it took this morning's downtime for us to finally realise that the root cause was a hardware failure at a critical part of the stack. Thankfully we managed to find a temporary solution to bring the site back live, while a more permanent fix will be deployed tonight.

What we've found over the last couple of years is that every time we have a problem, the team is able to come up with a very robust solution to make sure it doesn't happen again. Each time we witness a failure, it strengthens the product for the future. I guess we can call these growing pains, and I hope that subscribers can be patient while we go through these changes. The more we grow, the more we can invest and the better the service will be for all subscribers in future.

I wanted to reiterate to subscribers that all data, screens, folios and customisations are backed up and secured offline every hour of the day. Outages are due to technical issues in serving our financial and website databases and have never led to any data loss. We take massive precautions in all of this and will continue to improve the robustness of our stack to bring you the service you trust us to provide.
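For the technically minded, the shape of that hourly job is roughly as follows. This is a simplified sketch only - the dump command, database name and paths are illustrative stand-ins, not our actual setup:

import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path

DUMP_CMD = ["pg_dump", "--format=custom", "example_db"]  # hypothetical engine and DB name
LOCAL_DIR = Path("/var/backups")                         # hypothetical local staging area
OFFSITE_DIR = Path("/mnt/offsite-backups")               # hypothetical off-site mount

def hourly_backup() -> Path:
    # Dump the database to a timestamped file, then copy it off the box.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dump_file = LOCAL_DIR / f"db-{stamp}.dump"
    with dump_file.open("wb") as fh:
        subprocess.run(DUMP_CMD, stdout=fh, check=True)  # fail loudly if the dump breaks
    shutil.copy2(dump_file, OFFSITE_DIR / dump_file.name)  # keep a copy off the server
    return dump_file

if __name__ == "__main__":
    hourly_backup()  # scheduled to run once an hour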

We've managed a 98% uptime record for our services over the last year, and the last time we had a full day offline was over 12 months ago. This actually compares very favourably with other Software-as-a-Service businesses, which average around 97% - many Twitter users will remember the infamous 'Fail Whale'! Of course, when subscribers demand access to all of…
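To put those uptime percentages in perspective, here is the back-of-the-envelope arithmetic (pure illustration, not a measurement of our own figures):

HOURS_PER_YEAR = 24 * 365  # ignore leap years for a rough figure

for uptime in (0.98, 0.97, 0.999):
    downtime_hours = HOURS_PER_YEAR * (1 - uptime)
    print(f"{uptime:.1%} uptime = roughly {downtime_hours:.0f} hours of downtime a year")

# 98.0% uptime = roughly 175 hours of downtime a year
# 97.0% uptime = roughly 263 hours of downtime a year
# 99.9% uptime = roughly 9 hours of downtime a year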




17 Comments on this Article

JKeat 22nd Nov '13 1 of 17

"What we've found over the last couple of years is that every time we have a problem the team is able to come up with a very robust solution to make sure it doesn't happen again"

Anti-fragility! :)

On another note, there's a bug I previously reported on the Valuation Chart in the Folio Analysis page that still hasn't been fixed. It's been a while now, and I know you were all busy with the Investor Show prep previously.

Will it be looked at anytime soon?

Edward Croft 22nd Nov '13 2 of 17

In reply to post #79354

Indeed - we do know about that one - and we promised someone else it would be fixed this week! I'm still hoping we can.

Trigger 22nd Nov '13 3 of 17

The problem you face is not just the occasional hardware glitch but, as you imply, one of growth: growth in terms of both the products you offer and the number of users. Both will put more strain on your system, and I'm guessing your users will expect better than 98% uptime.

To be able to expand at will and provide decent uptime requires a fully flexible and redundant/failover architecture, which is very expensive!! That's without thinking about the resources you need to maintain and update it. One way of achieving this is to outsource the server provision to a provider with large datacentre capabilities, of which you are just a small part. As long as the segregation/security is to your approval and the SLAs are appropriate, you can achieve a lot.

If you ever went down this route, you should seek advice on structuring your contracts, as this is the key to success, cost savings (hopefully) and deciding what resource you should retain.

Also, virtualisation, if you don't already use it, can be great for both utilisation growth and failover/recovery, even on the fly. It was pretty impressive when I left IT two years ago, so it should do some fab stuff now.

Just a thought.

lightningtiger 22nd Nov '13 4 of 17

Faults happen from time to time & need fixing. Well done for fixing the fault in such a short time.

I have had an intermittent fault with my BT line, and my broadband kept cutting out for the last month. Being an ex-BT engineer, I managed to find the fault myself. It took 6 BT engineers to attempt to find the fault, and they all failed. The line tested OK on 6 occasions when the fault did not show up. I had to point out where the fault was to the last engineer so that it could be fixed.
It is at last working properly now, thank goodness. So well done &.............

Cheers from Lightningtiger

Calalily 22nd Nov '13 5 of 17

I think you did well under the pressure of expanding. Something has to give a little in these circumstances, and you did well to get it under control and keep up communication. Well done Ed, Dave & team!

slartybartfast 22nd Nov '13 6 of 17

The only comment I would make is that, despite continually seeing a message directing me to Twitter, there were no updates there. I see that the first one was posted at 08:13.

BrianGeee 22nd Nov '13 7 of 17

Stockopedia is a useful site, but probably not critical to most users. I'd prefer the staff to keep pushing ahead with good quality s/w development, and don't mind too much if there's a h/w failure once in a while.

PhilH 22nd Nov '13 8 of 17

Any successful online service needs to build resilience ... particularly against DoS attacks (which was my first thought today). If it wasn't a DoS attack, then it's only a matter of time.

Professional Services: Sunflower Counselling
mpat89 22nd Nov '13 9 of 17

You guys should just get a rack in a data centre (e.g. TeleCity) and build a private cloud. That way you are immune to any single hardware failure and suddenly you have an extremely scalable system (essentially it's your own AWS). I would have presumed you were already doing this - is this not the case? If not, why not?! :)

Professional Services: Web hosting
Edward Croft 23rd Nov '13 10 of 17

In reply to post #79384

mpat89 - we actually spent a month deploying on Amazon's cloud, but found it wasn't fast enough - certainly not for our purposes, which require massive amounts of computation in very short spaces of time. The input/output stress on the computation servers and databases requires extremely high-specification servers, and off-the-shelf solutions didn't cut it. So yes, we decided to build our own private cloud. But you can get rainy days even in the cloud ;-)
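To give a flavour of the kind of check that drives that decision, here's a toy sketch of timing random reads against a scratch file. It is purely illustrative, not our actual benchmarking (real tests use direct-I/O tooling, since the operating system's page cache flatters numbers like these):

import os
import random
import time

PATH = "scratch.bin"          # hypothetical test file
SIZE = 256 * 1024 * 1024      # 256 MB of throwaway data
BLOCK = 4096                  # read in 4 KB chunks

# Write some scratch data to read back.
with open(PATH, "wb") as fh:
    fh.write(os.urandom(SIZE))

fd = os.open(PATH, os.O_RDONLY)
offsets = [random.randrange(0, SIZE - BLOCK) for _ in range(10_000)]

start = time.perf_counter()
for offset in offsets:
    os.pread(fd, BLOCK, offset)   # one random 4 KB read
elapsed = time.perf_counter() - start

os.close(fd)
os.remove(PATH)
print(f"{len(offsets)} random reads in {elapsed:.2f}s "
      f"(~{elapsed / len(offsets) * 1e6:.0f} microseconds per read)")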

mpat89 23rd Nov '13 11 of 17

Ha yeah. The problem with IT infrastructure is that something always goes wrong. Problems always end up taking longer to fix than expected. Personally all I care about at the moment are these upcoming white papers on stock ranks. :)

Professional Services: Web hosting
Bdroop 24th Nov '13 12 of 17

Mirrored servers? Lots of 'em! Whilst it was unfortunate, it's normal. Glad to hear you're growing! Hopefully scale can at some point improve the product further still.

corrsfan 24th Nov '13 13 of 17

Have to agree on the SLAs as above - you do get what you pay for. I used to be in Dell server tech support. Those that paid for 4hr support - as long as they weren't miles from anywhere during a blizzard - would pretty much get the parts and an engineer within 4hrs, and the likes of TeleCity would typically get both within 2hrs. If the parts weren't available locally they would be flown in from elsewhere to get there within the 4hrs (even on Christmas Day!). For those that didn't want to pay for 4hr support, we would only commit to 5pm the next working day. So you'd have unfortunate souls phone in at 4:30pm on Friday, being told Tuesday 9am-5:30pm, and if the part wasn't there on Tuesday, better make that Wednesday.

I'm sure you'll find a happy medium to manage all of this between SLAs and spare/redundant hardware across the different resource pools you'll have for web, database and storage as you grow.

dunno 26th Nov '13 14 of 17

In reply to post #79361

"Also, virtualisation, if you don't already use it, can be great for both utilisation growth and failover/recovery, even on the fly. It was pretty impressive when I left IT two years ago, so it should do some fab stuff now."

Agreed. Buy (or lease, or rent space on ...) a few servers and run 2 or 3 virtual machines on each - or just 1 if it's a resource hog. This makes it easy to move apps around if they need more power or hardware goes down. We buy pre-owned but still-in-warranty hardware - Intel/Linux machines nowadays, I think - and you can get some powerful kit for very reasonable prices that way.

 

DJLJ23 6th Dec '13 15 of 17

Hi,

I have found this site incredibly useful and had been thinking about using the European information, but given how unreliable the site is currently, that's on hold.
Could you share with us your plans to improve the service, please?

Square Mile Junky 30th Dec '13 16 of 17

Defining a scalable architecture for a service provisioned through the cloud is a demanding job. I have worked for 24/7 online gambling companies and for web and email scanning companies, and latency can be a killer as you grow - especially if any number crunching is involved. At some point you have to bite the bullet and migrate to a fully scalable architecture. The companies I worked for were not keen on VMs, and our network architect described Amazon Cloud as a place for hobbyists :-). You are right to build your own cloud. Another option is to go with a provider that shares your passion for data and wants to partner with you in providing best-in-class service - let them focus on exceeding any SLAs you set. You have a great product, and I am sure you are already on the case.

Edward Croft 30th Dec '13 17 of 17

In reply to post #80181

Hi Square Mile Junky. Just catching up here. Yes we tried the Amazon Cloud, but the latencies were really dreadful for the kind of heavy computation that we do every night. We spent 4 weeks re-building everything on Amazon's cloud only to pull the plug. Amazon is great for consumer mobile apps that just require massive on demand storage, but not necessarily much manipulation once the data is stored. We've found our problem doesn't suit their solution - but glad we tried it.

Thanks for the support - we are constantly investing in and improving the service. Most of the work we do goes unseen by everyone using the site. It's a big job to build out something that crunches gigabytes of data for 8 hours every night and presents it all speedily to thousands of end users during the day. We've already put in a lot of processes since November that have brought our uptime back to, and beyond, where it was for the 18 months before November. But we've certainly further to go.
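The broad pattern is to do the heavy lifting overnight, store the results in a read-optimised form, and let the daytime site simply look them up. A toy sketch of that shape - the table, column names and scores below are made up for illustration, and the real pipeline is far more involved:

import sqlite3

def nightly_batch(conn: sqlite3.Connection, raw_scores: dict[str, float]) -> None:
    # Overnight job: crunch the raw data and write precomputed results.
    conn.execute("CREATE TABLE IF NOT EXISTS ranks (ticker TEXT PRIMARY KEY, rank REAL)")
    ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
    rows = [(ticker, 100.0 * (len(ranked) - i) / len(ranked))
            for i, (ticker, _) in enumerate(ranked)]          # percentile-style rank
    conn.executemany("INSERT OR REPLACE INTO ranks VALUES (?, ?)", rows)
    conn.commit()

def daytime_lookup(conn: sqlite3.Connection, ticker: str) -> float | None:
    # Daytime request: a single cheap read, no recomputation.
    row = conn.execute("SELECT rank FROM ranks WHERE ticker = ?", (ticker,)).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
nightly_batch(conn, {"AAA": 1.2, "BBB": 3.4, "CCC": 0.7})   # made-up scores
print(daytime_lookup(conn, "BBB"))                           # -> 100.0, the top-ranked ticker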

We'll also be making some massive changes in 2014 - adding 6,000 US stocks and shifting our whole codebase onto a new framework - so I can't promise there won't be further disruption. I hope, though, that everyone can enjoy the journey even if we have the odd stumble - nobody ever grew up without a few bruised knees!

