Pivotal Cloud Foundry Support

Is there a way to bring down a foundation and bring it up again when needed, instead of totally destroying it and recreating it?

Sujith Kammila posted May 08, 2020 01:15 PM

To give background on this query.

We have a sandbox foundation up and running on Azure, and we are looking for a way to bring it down when it is not required and bring it back up when we want to test something. The aim is to reduce our Azure costs.

So I am looking for suggestions on how to reduce cost while the sandbox is not in active use.

 

Thanks,

Sujith K

Daniel Mikusa

It's possible to stop/start VMs. We have a section in the docs that explains this more: https://docs.pivotal.io/platform/application-service/2-9/adminguide/start-stop-vms.html

Sujith Kammila

Thank you @Daniel Mikusa - Tanzu Support for looking into this. Adding to the query:

It would be great if you could share the sequence to keep in mind when stopping and starting VMs while switching off the whole foundation.

Daniel Mikusa

If you follow the steps in the docs, it should work. BOSH takes care of the hard work for you. The only area where you might get tripped up is with the MySQL cluster in PAS/TAS.

If you have a multi-node cluster, it may need to be bootstrapped. I think recent versions take care of this for you, but if your MySQL nodes don't come back up, you'd want to follow the docs to bootstrap it.

 

https://docs.pivotal.io/platform/application-service/2-9/mysql/bootstrap-mysql.html

 

Note, it can be alarming when the MySQL cluster doesn't come back up because a lot of things depend on that. Ignore the other failures until MySQL is up, running and working. Any other problems should recover once MySQL is working.

 

Hope that helps!

Daniel Mikusa

I'm not familiar with the Azure "deallocation" but the process seems reasonable.

 

Stopping the resurrector is a good precaution. That said, as long as you stop the VMs quickly, it's probably not an issue if you leave it on. The resurrector can take up to a minute before it notices a problem, and it will also go into meltdown mode if more than 15 or 20% (I can't remember exactly) of the VMs go down. This is a precaution for catastrophic events. So if you're scripting this and turning all the VMs off over the course of a couple of seconds, the resurrector won't kick in anyway.
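
To make the meltdown idea concrete, here is a rough Python sketch. It is not BOSH's actual resurrector logic: the function name and the `meltdown_fraction` parameter are hypothetical, and the 0.2 default only stands in for the "15 or 20%" figure mentioned above.

```python
def resurrector_would_recreate(total_vms: int, down_vms: int,
                               meltdown_fraction: float = 0.2) -> bool:
    """Illustrative only: would a resurrector-like monitor try to recreate VMs?

    meltdown_fraction is a placeholder for the "15 or 20%" figure quoted in
    this thread; check the BOSH documentation for the real setting.
    """
    if total_vms <= 0:
        raise ValueError("total_vms must be positive")
    if down_vms <= 0:
        return False  # nothing is down, so there is nothing to recreate
    # Too many VMs down at once is treated as a catastrophic event,
    # so the monitor stands back instead of recreating everything.
    return down_vms / total_vms <= meltdown_fraction


# A couple of VMs down out of 40: below the threshold, recreation is attempted.
print(resurrector_would_recreate(total_vms=40, down_vms=2))
# Everything stopped at once: above the threshold, no recreation.
print(resurrector_would_recreate(total_vms=40, down_vms=40))
```

The point of the second call is the scripted-shutdown case: if all VMs disappear within seconds, the fraction that is down jumps past the threshold right away.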

 

Hope that helps!

Sujith Kammila

Hello @Daniel Mikusa - Tanzu Support, thank you for the word of caution. I still have to check on the MySQL part.

Meanwhile, I came across information that unless the VMs on Azure are marked as deallocated, they continue to incur cost. A few online blogs say the same, for example:

https://www.azurebarry.com/how-to-reduce-your-azure-costs-shutdown-azure-vm-properly/

Keeping that in mind, I have drafted the sequence below. Feel free to modify it.


While shutting down:

1) Switch off the resurrector. Otherwise, as the VMs are deallocated, BOSH might try to bring up replacement VMs.

2) Stop the VMs (which are part of TAS) in PCF as per the doc you shared (https://docs.pivotal.io/platform/application-service/2-9/adminguide/start-stop-vms.html)

3) Shut down the OpsMan VM.

4) Verify that the VMs are marked as deallocated in the Azure portal. If not, run the deallocate command on the VMs belonging to the resource group of the targeted foundation.
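
The verification in step 4 could also be scripted instead of checked in the portal. Below is a minimal Python sketch, assuming you feed it the JSON printed by `az vm list --resource-group <rg> --show-details --output json`, where each entry carries a `name` and a `powerState` field. The helper name and the sample VM names are hypothetical.

```python
import json


def not_deallocated(vm_list_json: str) -> list[str]:
    """Return the names of VMs whose power state is not 'VM deallocated'.

    A VM reported as 'VM stopped' is shut down at the OS level but still
    allocated, so it is listed here as well.
    """
    vms = json.loads(vm_list_json)
    return [vm["name"] for vm in vms
            if vm.get("powerState") != "VM deallocated"]


# Hypothetical sample of what the az command might return (trimmed).
sample = json.dumps([
    {"name": "opsman-vm", "powerState": "VM deallocated"},
    {"name": "diego-cell-0", "powerState": "VM stopped"},
    {"name": "router-0", "powerState": "VM running"},
])

for name in not_deallocated(sample):
    # These are the VMs that would still need `az vm deallocate`.
    print(name)
```

Any name this prints is a candidate for `az vm deallocate --resource-group <rg> --name <vm>`.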

 

While bringing up the foundation:

1) Start the VMs that were deallocated, via the Azure CLI or the Azure portal.

2) Bring OpsMan up.

3) Start the VMs in PCF as per the doc https://docs.pivotal.io/platform/application-service/2-9/adminguide/start-stop-vms.html

4) Turn on the resurrector.
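
For anyone who wants to script the two sequences above, here is a rough Python sketch that only builds the ordered command lists and prints them, without executing anything. It assumes the BOSH CLI v2 commands `update-resurrection`, `stop`, and `start`, plus `az vm deallocate` and `az vm start`. The deployment, resource group, and VM names are hypothetical, and the exact flags for your foundation should be taken from the linked doc.

```python
def az_vm(action: str, resource_group: str, name: str) -> list[str]:
    """Build an `az vm <action>` command for a single VM."""
    return ["az", "vm", action, "--resource-group", resource_group, "--name", name]


def shutdown_plan(deployments: list[str], resource_group: str,
                  opsman_vm: str, other_vms: list[str]) -> list[list[str]]:
    """Ordered commands for bringing the foundation down."""
    plan = [["bosh", "update-resurrection", "off"]]               # step 1
    plan += [["bosh", "-n", "-d", d, "stop"] for d in deployments]  # step 2
    plan.append(az_vm("deallocate", resource_group, opsman_vm))   # step 3
    plan += [az_vm("deallocate", resource_group, vm) for vm in other_vms]  # step 4
    return plan


def startup_plan(deployments: list[str], resource_group: str,
                 opsman_vm: str, other_vms: list[str]) -> list[list[str]]:
    """Ordered commands for bringing the foundation back up."""
    plan = [az_vm("start", resource_group, vm) for vm in other_vms]  # step 1
    plan.append(az_vm("start", resource_group, opsman_vm))        # step 2
    plan += [["bosh", "-n", "-d", d, "start"] for d in deployments]  # step 3
    plan.append(["bosh", "update-resurrection", "on"])            # step 4
    return plan


# Hypothetical names; the bosh commands also expect the director to be
# targeted, for example through the BOSH_ENVIRONMENT variable.
args = (["cf-example"], "sandbox-rg", "opsman-vm", ["bosh-director", "router-0"])
for cmd in shutdown_plan(*args) + startup_plan(*args):
    print(" ".join(cmd))
```

Printing the plan first makes it easy to review the order before wiring it up to something like `subprocess.run`.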

 

Thanks,

Sujith K

Daniel Mikusa

Yes, it looks good.

Sujith Kammila

Sorry, I deleted my previous comment in order to add the OpsMan step to the sequence. Is the latest sequence also good?

The deallocation process might take a while. In a previous instance it ran for several minutes (I don't remember the exact time span, but it was around 10-20 minutes), so I definitely plan to stop the resurrector during this activity.

 

Thanks,

Sujith K