Using Two GPUs at Once

EP 11: Tuesday, Sep 5, 2023

Transcript

Mark Johnson 0:03
I have been exploring the secrets of the TPM.

Alan Pope 0:07
Is that anything like the TPS report from Office Space?

Mark Johnson 0:11
Almost exactly as useless. No, no, no, I jest, of course. So TPM stands for Trusted Platform Module, which is not a terribly useful descriptive name. It’s basically a bit of hardware on your computer that can do things like encrypting, decrypting and storing small amounts of data. But it also looks at your machine and tries to understand what state various bits of it are in. And it can use those states to make decisions about whether or not it should decrypt the data that you’re asking it for.

Alan Pope 0:47
Are the states things like: has the computer been taken apart? And is it currently in a laboratory of some foreign actor or something? Or is it simpler than that?

Mark Johnson 0:57
Not necessarily that specific, but that sort of thing. It tries to essentially tell if someone has tampered with the known state of the computer, like, have they turned Secure Boot off? Or has the bootloader changed? Has the firmware been changed, things like that. And there are other things that the operating system can hook into, I think, as well. Now, the reason that I got interested in this is because I use full disk encryption on my work laptop, and I find it a bit of a pain when I have to reboot. For example, recently, Ubuntu started requiring you to reboot to install some software updates. And the workflow for that became: reboot, enter your decryption password, let it run the updater. And then it reboots again. And then you have to enter your encryption password again, and then it boots back into your system. And that was a right pain having to do that. And other times, you know, I haven’t actually taken my laptop anywhere, so I’m not worried about anyone. I’ve not left its side, so it’s not like it’s someone else trying to do anything. It’s just me pressing reboot from my logged-in system. And, you know, it’d be nice if I didn’t always have to enter the encryption password. So I got wondering whether using the TPM might be a way around this. So I found some useful and some less useful guides to this on the internet. One went into quite a bit of detail about how to set up a script that ran at boot, ran some commands against the TPM and got it to give you the encryption key, which all seemed to run okay, except that it didn’t actually get an encryption key out. For some reason, when I ran the commands in the logged-in system, it worked. And when it tried to do it at boot, it just didn’t get anything. So I was obviously doing something wrong there. And then I found another tool called systemd with TPM2, which completely bricked my system. I should say, I was doing this on a VM.
In fact, I was using Quickemu with the tpm="on" option to test this out because I didn’t want to actually be doing this on a real system. But while I was doing this, I ended up reading some man pages, as you do, and found in the man page for crypttab that you can configure crypttab to point to your TPM directly, without any additional tools or utilities or scripts that you have to write yourself.

Martin Wimpress 3:23
And I imagine crypttab is something that works alongside fstab to say these file systems are encrypted, and here are the strategies to provide decryption keys.

Mark Johnson 3:35
That’s exactly what it does. Yes, so you have a list saying here are your encrypted devices and here’s what they’re called. And then one of those names maps to the thing that you’re mounting in fstab. Finding this, I managed to do quite a bit of digging and found that there’s a command called systemd-cryptenroll. What this does is generate a decryption key and add it to your LUKS-encrypted volume, because you can have multiple keys associated with a volume. And then it stores that in the TPM and does some other wizardry in the background to connect things up. Then you edit your crypttab and add an option there to say which TPM device it is; if you’ve only got one TPM, you just say auto and it finds it. And then you rebuild your initramfs, which is the system that runs at boot to bootstrap everything, and that will include the bits it needs to read the stuff out of the TPM. And this was all going very well, except I was trying to do it on Ubuntu. And for some reason, when you try and rebuild the initramfs on Ubuntu with this option, it doesn’t recognise it. And I tried a few different versions with different versions of cryptsetup, but it just wasn’t having any of it, despite other people saying they were successful doing this on other distributions. So I thought, well, someone said this works on Fedora, I’ll give that a go. And I did exactly the same thing on Fedora, except they use a utility called dracut, or something to that effect, to rebuild the initramfs. And it worked. So I ended up with a system that I could boot into. And when I rebooted, it would decrypt the disk without me having to do anything.
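[Editor's sketch] The three steps Mark describes (enrol a TPM-backed key with systemd-cryptenroll, add a TPM option to crypttab, rebuild the initramfs with dracut) can be laid out roughly as below. This is a minimal sketch, not the exact commands used on the show: the device path, the mapping name, the UUID placeholder and the choice of PCR 7 (Secure Boot state) are all illustrative assumptions, and the script only prints the commands rather than running them.

```python
import shlex


def enroll_command(device, pcrs=(7,)):
    """Build (but do not run) a systemd-cryptenroll command that adds a
    TPM2-sealed key slot to an existing LUKS volume."""
    pcr_list = "+".join(str(p) for p in pcrs)  # PCR indices are joined with '+'
    return [
        "systemd-cryptenroll",
        "--tpm2-device=auto",        # 'auto' picks the only TPM present
        f"--tpm2-pcrs={pcr_list}",   # registers the key is bound to
        device,
    ]


def crypttab_line(name, luks_uuid):
    """Build a crypttab entry: mapping name, source device, no key file,
    and the option telling systemd to ask the TPM for the key."""
    return f"{name} UUID={luks_uuid} none tpm2-device=auto"


# Rebuilding the initramfs on Fedora, as mentioned in the episode.
REBUILD_INITRAMFS = ["dracut", "--force"]

if __name__ == "__main__":
    # '/dev/vda3', 'luks-root' and '<luks-uuid>' are hypothetical placeholders.
    print(shlex.join(enroll_command("/dev/vda3")))
    print(crypttab_line("luks-root", "<luks-uuid>"))
    print(shlex.join(REBUILD_INITRAMFS))
```

As in the episode, this is something to try in a VM with a software TPM first; the existing passphrase slot stays on the volume, so it remains the fallback if the TPM refuses to release the key.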

Alan Pope 5:21
Can I just clarify, you’re doing this all in a VM? That’s talking to the fake TPM module? And you’re doing it with Fedora in a VM, right?

Mark Johnson 5:29
In a VM? Right? Yes. And that’s right, isn’t it? Martin? It’s actually it’s a software TPM.

Martin Wimpress 5:34
Yeah, yeah. So what Mark is using here is my Quickemu project, which is a wrapper around QEMU. And one of the things that I added to that project was to enable a software TPM emulator.

Mark Johnson 5:49
So this was the result I was looking for, except I didn’t have the option, if I shut down the computer completely, to have it then ask me for the password when I logged in. So the way that the TPM works is it has a number of things called Platform Configuration Registers, which is basically where it stores, like, a signature of the various things that it’s looking at. And you can tell it, when you’re running systemd-cryptenroll, which of the registers it should look at to decide whether it should give up the encryption key. But I went looking at what happens to these registers when you reboot versus when you power down, and there’s no difference. So there’s no way of saying decrypt it for me when you reboot, but if I do a cold boot, ask me for the password.
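[Editor's sketch] Mark's observation follows from how a Platform Configuration Register is updated. A PCR cannot be written directly; it can only be "extended", meaning the new value is a hash of the old value concatenated with the new measurement. The toy model below assumes a SHA-256 bank and made-up measurement strings (real firmware measures structured event data), and only illustrates why a warm reboot and a cold boot end up indistinguishable while a changed boot component does not.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR


def extend(pcr, measurement):
    """PCR extend: new = SHA256(old || SHA256(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def boot(measurements):
    """Replay a boot: the boot-chain PCR starts at zero on every platform
    reset, then each component is measured into it in order."""
    pcr = bytes(PCR_SIZE)
    for m in measurements:
        pcr = extend(pcr, m)
    return pcr


# Hypothetical measurements, purely for illustration.
chain = [b"firmware v1", b"secure boot: on", b"bootloader v2"]

warm_reboot = boot(chain)
cold_boot = boot(chain)
tampered = boot([b"firmware v1", b"secure boot: off", b"bootloader v2"])

print(warm_reboot == cold_boot)  # same measurements, same PCR value
print(tampered == cold_boot)     # a changed component changes the PCR
```

Because the register only reflects what was measured since the last reset, and both kinds of restart replay the same measurements, nothing in the value records how the machine was restarted.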

Alan Pope 6:35
Okay, that has clarified the scary thing that I was thinking, which is: surely it could just unlock at any point, like when someone steals it out of your bag? Exactly. But this in theory should work.

Mark Johnson 6:47
So this has kind of got me thinking, what’s the actual use case for doing this? Because, yes, it would stop the case where someone takes the hard drive out of your computer and then tries to decrypt it, because the encryption key only lives in that TPM. And once they’ve started messing around with the hardware, it’s not going to give it back. They can’t boot from a USB drive and get it out; it’s not going to accept that. But it doesn’t stop the case where, you know, I’m getting off the train and someone steals my bag, and then they turn the computer on and it boots and decrypts. And yes, they’re at the login screen, and they can’t actually log in without my password. But unless I’m wrong, my data is not completely secure at that point. There it is, decrypted, or accessible decrypted, on the computer somehow, if they could find some other exploit: if there was a bug in the lock screen that meant if they mashed the keyboard a lot, it crashed, something like that.

Alan Pope 7:44
They don’t even need to do that. They could just change the init to /bin/bash. They could decrypt the disk and have it boot straight to the root prompt running bash.

Mark Johnson 7:55
So there are other things which you can do, like locking down GRUB and locking down your EFI config so that things can’t be edited like that.

Alan Pope 8:05
Oh, okay. Well, the only editing would be at the point when you press F10, or whatever button triggers your GRUB, press down arrow and edit the line. And if you can stop them being able to edit the line at boot time, yes. I don’t know if GRUB does that.

Mark Johnson 8:20
Right. I’m pretty sure it’s possible that you can lock down things like that as well, which would stop them being able to do that. But I still don’t feel comfortable with the idea that I’ve enabled full disk encryption, but whenever you boot the laptop, it just decrypts anyway. So I’m sort of wondering, yeah, am I missing something here? Maybe one of our security-minded listeners will be able to enlighten me as to this. Or I wonder if one of you two might understand this better than I do. But it seems like an odd setup to me. So I’d be interested to hear anyone’s input on this.

Alan Pope 8:55
So to be clear, you haven’t enabled it on your host. You were fiddling with this entirely in a VM to get it working and understand the technology, but it didn’t seem to fit as you want. And so, help?

Mark Johnson 9:05
I guess, yeah.

Martin Wimpress 9:06
I can’t offer any assistance with this because I’ve never used TPM for disk encryption, so I have zero experience. Have you used it for something else? The only time I used it was when I integrated it into Quickemu in order to get Windows 11 images to boot. That was the reason that I tampered with it, so it was purely just to satisfy the system requirement for Windows 11. And that is where my TPM knowledge starts and ends.

Alan Pope 9:38
Linux Matters is part of the Late Night Linux family. If you enjoy the show, please consider supporting us and the rest of the Late Night Linux team using the PayPal or Patreon links at linuxmatters.sh/support. For $5 a month on Patreon you can enjoy an ad-free feed of our show, or for $10 get access to all the Late Night Linux shows ad-free. You can get in touch with us via email, show@linuxmatters.sh, or chat with other listeners in our Telegram group. All the details are at linuxmatters.sh/contact.

Martin Wimpress 10:12
I’ve migrated to a dual GPU system in two of my workstations. So I have Radeon and NVIDIA sitting in my PCs, K-I-S-S-I-N-G.

Alan Pope 10:25
I do hope they’re not touching, that can lead to all kinds of electrical failure.

Martin Wimpress 10:29
Well, maybe they’re not touching physically, but they’re definitely interacting digitally.

Mark Johnson 10:34
So is this that you’ve got two graphics cards with separate outputs plugged in, or are they doing some sort of combined processing to output a thing to a single screen?

Martin Wimpress 10:43
They are not both driving displays. Let me explain. So this all started when I used to have an RTX 3090 in my main workstation, and it’s a fantastic GPU, but it has one considerable drawback: it has 24 gigabytes of video memory, and half of that video memory is on the rear of the card. And the heat that memory creates is dissipated by a metal backplate. And that metal backplate was 99 degrees Celsius at all times. Wow. And that backplate is also adjacent to the fan in the case that pushes air out of the case. So what it’s actually doing is blowing superheated air at the radiator for the CPU water cooling, and it was turning that water radiator into a space heater. And the direction of the air that is exhausted from the case was at me. So what that meant was that I was permanently being blasted with not just warm but considerably hot air, and during the summer months that was just intolerable. So as much as I liked the GPU, I thought I’ve got to find a better way of doing things. And amazingly, using two GPUs is actually the solution. So what I’ve done is I took the 3090 out, which is an over-triple-slot card, so it takes up like half of the available space in the case; physically it’s huge. So I took that out, and I replaced that with a Radeon RX 6700 XT, which is a dual-slot card. And it’s what you’d call a mid-range GPU, I imagine. And then, with the space that made in the case, I was able to reorder the other cards on the motherboard and free up a single-slot space on the motherboard. And in there I added an NVIDIA T1000 GPU. And these are rather dinky. In fact, it’s a single-slot GPU, and it only takes power from the bus, which means technically it can only pull 75 watts; that’s the maximum that the PCIe slot can deliver. And actually the card uses way, way less than that.
For comparison, the RTX 3090 at idle would use about 40 watts of power, which is not too bad for a GPU, but under load it would get up to like 365 watts. It’s an absolute power pig, you know, in that regard. But it doesn’t matter whether it’s idle or going full bore, that backplate is 99 degrees all the time. That was the main problem.

Alan Pope 13:47
Was it not possible to just, like, turn the case 90 degrees and face it the other way and blow the hot air out the door or something? It had to go?

Martin Wimpress 13:55
Well, it did have to go. It didn’t really matter that it was blowing at me, the heat had to go into this room in some way or other. It would find you. Yes. Using the Radeon 6700 alongside the NVIDIA T1000 has brought the power consumption down considerably. With the two together under idle conditions, the Radeon uses about 30 watts of power when it’s just moving the desktop around. And as best as I can tell, the NVIDIA T1000 uses between four and five watts when idle, because it’s really not doing anything. I only have the displays plugged into the Radeon GPU; the Nvidia GPU is purely for compute, and I’ll get to how I use it in just a moment. And when the system is under load, let’s imagine I am game streaming, so playing games and streaming that, all with OBS, the Radeon is using about 190 watts to basically composite OBS and play the game, and then the Nvidia GPU is just being used for the compute to do the encoding of the video stream that gets sent to Twitch or wherever.

Alan Pope 15:07
I’ve seen other people suggest having two GPUs. And in fact, I’ve seen some people online who profess to be experts at OBS suggesting that this was actually not a good thing, and that you should absolutely not put two GPUs in a machine, but you should just throw one big GPU at it. And so it’s interesting to hear your experience of it being good and performant and not hot.

Martin Wimpress 15:31
Yes. So it works very well inasmuch as now I’m getting considerable power saving. So under load, this new configuration is using about 150 watts less than the 3090, and the temperatures are way down. Both those GPUs sit around 50 to 60 degrees, depending on the load that they’re under, so it isn’t generating that same volume of heat into the case and the room around me.

Mark Johnson 16:01
And are there any complications with installing that? I assume that you’re using the vendor-supplied drivers for both of these? Or are you using the open source drivers for AMD and Nouveau? What’s it like having both of those installed and managing that?

Martin Wimpress 16:17
That’s an excellent question, because that’s actually sort of the secret sauce in making this all work. And it kind of goes to Alan’s point about people recommending not doing this. I imagine maybe those people aren’t running Linux, where Linux kind of shines at this particular sort of use case. So I am just using the regular drivers to run the Radeon stuff. So I don’t use the AMDGPU-PRO drivers, or whatever they’re called. I’m just using the open source drivers plus the firmware that you get with the Linux firmware bundle. And that means that, you know, Wayland and all of that stuff works, and video acceleration, hardware encoding and what have you is available on the Radeon GPU, but I’m choosing not to use it. On the NVIDIA side, I’m just using the NVIDIA proprietary drivers. But the important step here is that on Ubuntu, there’s a meta package for the NVIDIA drivers which has -headless in the name. And effectively, that includes all of the NVIDIA drivers except the display server drivers. So no X.Org drivers. And so that enables things like CUDA and NVENC and all the compute capabilities, but it has no facility to drive displays at all. Right, so when you run those two side by side, you now get the full compute capability of an Nvidia GPU, but none of the display output. And that’s also what helps keep the power draw of the Nvidia GPU down, because actually driving the displays is what pumps a load of voltage through the GPU. And I was talking about temperatures and power consumption earlier. I’m able to measure that with nvtop, so there’s a little command line love for you here. nvtop’s been around for ages. The NV is a clue that it was an NVIDIA tool, but it recently added support for multiple GPUs. So when I run nvtop now, it’s a stacked display and I can see all of the metrics for both GPUs, what’s running on them and all the rest of it.
So with this configured, and I’ve run this configuration on Ubuntu, and I’m now running it on NixOS. On NixOS, it’s a slightly different configuration, in that you just tell your NixOS configuration that you’re using what’s called reverse sync, because traditionally, when you have an Nvidia GPU, it wants to be the primary and the other things are its subordinates. And what we’re doing is tipping that on its head: I want the Radeon GPU to be the primary and I just want the Nvidia GPU to be the sibling, the dumb thing that we just do compute with. And it works as well on both. It’s been really stable. I’ve been running this for like nine months now; it’s been really great. But this means I have all the benefits of running a Radeon driver on the desktop. So Wayland, if you care about those things, will work just fine. But most importantly, all of those workloads where an Nvidia GPU is required will work too. For example, DaVinci Resolve will work even though the display driver is using Radeon and it has a requirement for CUDA. It finds CUDA being satisfied by this other GPU, which means you can do your effects composition on the Nvidia GPU and the video encoding on the Nvidia GPU, all seamlessly. And the same is true of OBS Studio. Everything’s composited with the Radeon, but then the Nvidia GPU is used for all of the hardware encoding, and you can turn all of the quality settings up to 11 on the Nvidia GPU because all it’s doing is that encoding piece. So you get no penalty to your game performance, where on a single Nvidia GPU the video encoding can sometimes take too much away from the game.
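[Editor's sketch] Martin reads power and temperature interactively with nvtop. For logging the same numbers from a script, one option on the NVIDIA side is the nvidia-smi query interface that ships with the proprietary driver. This is a swapped-in technique, not what was used on the show, and it only covers the NVIDIA card; the Radeon would need a different source. The sketch assumes the card reports numeric power and temperature values (some cards report "[N/A]" for power draw, which this parser does not handle).

```python
import csv
import io
import subprocess

# nvidia-smi prints one CSV row per GPU for the requested fields.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,power.draw,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV output into a list of dicts, one per GPU."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        name, power, temp = row
        gpus.append({"name": name, "power_w": float(power), "temp_c": int(temp)})
    return gpus


def read_gpus():
    """Run nvidia-smi and parse its output (requires the NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in read_gpus():
        print(f"{gpu['name']}: {gpu['power_w']} W, {gpu['temp_c']} C")
```

Keeping the parser separate from the subprocess call means it can be exercised with a captured line of output on a machine that has no NVIDIA GPU at all.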

Alan Pope 20:16
It’s interesting, I have a not quite as complicated but similar setup of multiple GPUs in the NUC that is on my desk here, which has an inbuilt Intel CPU and AMD GPU, and externally an eGPU, which is an NVIDIA card. But I’m using them the other way around, the traditional way: the NVIDIA is driving the displays and the AMD is for whatever else I can use it for. Yeah, it’s interesting that you can actually use both the GPUs at the same time with both drivers loaded. And it works fine on Linux and on Windows.

Martin Wimpress 20:50
Right. So you’ve used it with Windows and Linux quite happily. Yeah, yeah. Yeah. I mean, maybe some people haven’t experimented with this in other parts of the world because they live in places where air conditioning is ubiquitous, and they wouldn’t run into this particular, you know, climate issue.

Alan Pope 21:07
Or live in Norway, where it’s just naturally cold.

Martin Wimpress 21:11
Yeah, exactly. And I also have an Intel Arc GPU. And what I’m going to be looking at next is how I can potentially use the Intel Arc GPU in a similar configuration. So maybe use Intel Arc as the primary with Nvidia alongside it, or maybe, in my test workstation, all three GPUs at the same time, and see what madness we can cook up there. But yeah, it’s been a great configuration. So if you have got mixed workloads, dual GPU setups on Linux work a treat. And these T-series cards from NVIDIA: single slot, bus powered, not tonnes of CUDA performance, something around the sort of 1050 Ti region, but in terms of their video encoding performance, exactly the same as a 3090. So pretty great.

Alan Pope 22:06
I have a further update to what I talked about in episode 10. A small reminder: last time, I downloaded some historical EV data, charging data, from BMW, the manufacturer of the car, and I uploaded it to Axiom, my employer, to build a dashboard so I could see some detail about the different types of places where I’ve charged and how frequently I use my home charger and other chargers. So that’s what I talked about in the last episode; go back and listen to episode 10 for a refresher on that. But the problem with that is I only had the historical data. I’ve had the car for 18 months, and I could download a snapshot of that 18 months, but I couldn’t use that to get ongoing data, because I still own the car and I’m still charging the car every day or so. And so I wanted to get ongoing data. And BMW has an API for getting that car data. And I tried to register; they have a service called AOS, which is Aftersales Online System. And I got rejected. I applied for access, and they said nein. So I said, please, it’s my car, and I would like access to the data around my car. And I got redirected to another department, who also said no, because, and I quote, I do not fit to be a publisher of technical information. So what I think it is, is it’s designed for app developers or people who work in the automotive industry who want to integrate with the car system in some way. And I kind of moaned a little bit on Mastodon, and then I did a bit of Googling and actually found a tool that helped me, and it’s called bimmer_connected, Bimmer being the colloquial name for BMW-manufactured motor vehicles. I went down a little bit of a rabbit warren: in the UK, we tend to call them Beemers. But actually Beemer is generally the term for the motorcycles made by BMW and Bimmer is the term for the cars, apparently. And there’s a different name they use in China that sounds very much like Bao Ma, which sounds a bit like a cow or something. It’s very strange.
Anyway, there’s a whole article on BMW’s website about Bimmer, Beemer, and so on. Anyway, there’s this piece of software called bimmer_connected and it’s open source. And it’s a library to query the status of your BMW or Mini using the ConnectedDrive portal. And the ConnectedDrive portal is a thing that I have a sign-on for, because it’s the thing that the official app uses to link you as a person to your car. This thing is a Python library, so you could use it to query the API using the username and password that you already have, and the VIN of your car, the VIN being the vehicle identification number, or VIN number. And it also has a command line tool you can use to get the data, and the command line tool is called bimmerconnected. And all you do is run bimmerconnected and you pass it your username and password and the region that you’re in, because I think they’ve got multiple endpoints for USA, China and the rest of the world. And then it produces a JSON dump of data about the car. And what data about the car, you ask?

Martin Wimpress 25:28
Well, but this is a different kind of data, because before you did, like, a data checkout, and it was all of your data for all time. So yeah, what is this? Is this everything again? Or is this something else?

Alan Pope 25:41
No, this is just a snapshot. And the snapshot is, like, real time. So if you query it multiple times over a period of time, the data will change. Or some of the data will change. There’s some of it which is stuff that doesn’t change: physical attributes of the car, like the make and the model or the drivetrain, whatever, and enabled capabilities that the car has, like electric windows and so on. And there’s some stuff that doesn’t change very often, like software versions; that’s reported in there as well. It’s just one big JSON file. What else is in there? The charge schedule, so if you set it to charge at certain times, that’s in there. The status of the doors and the windows and the sunroof, whether they’re closed or open, which is good from a security point of view. But the stuff I actually wanted is also in there. And the stuff I wanted was the mileage, the charge level, the range, and the latitude, longitude and heading of the car. So I can tell where it is, what the charge level is, how many miles I’ve done and which way it’s pointing. And which way it’s pointing? Yes, which is very helpful. I think the reason why they put that in there is that in the app, it shows a little picture of your car, and it actually does show which way the car is pointing on a map, which is quite cute. I don’t know why that’s useful. But it is. So I wrote a five-line shell script which calls bimmerconnected with all my credentials, which dumps out the JSON. And then I just throw that at Axiom using curl, using our API. And I was doing it every minute, but then realised that was a little bit excessive, to keep poking it going, where’s my car, where’s my car, where’s my car, every 60 seconds, especially given that when I looked at the data, I zoomed in on the dashboard that I built in Axiom, and I could see that even if I poke the API every 60 seconds, it only actually updates every five minutes. So I think my car only reports status every five minutes.
And so I dialled back my script, so that it goes to sleep for five minutes and then pokes the API again. So now I have the historical data, and I have ongoing data showing charge level. It doesn’t quite have all the information that I could get from the data dump. It doesn’t have, like, the street address of the charger where it’s currently sat. But it does have latitude and longitude. And I can calculate: if, you know, the car was at a certain spot and the amount of charge went up, then I could log that somehow. So I can use this information. It’s just not quite as nicely formatted. But I could also, once a month, do a takeout and get that historical data again. And I’ll put all of this in a follow-up blog post to the last one, and that one will be in the show notes. But I just thought I’d mention that I’ve managed to wrap this whole thing together with the takeout and bimmer_connected, and thank you to all the wonderful people who’ve written and maintain that bimmer_connected bit of Python. Well, open source will find a way. Yeah, it certainly does, as well as a dodgy shell script running on a server in my house.
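[Editor's sketch] Alan's script is a few lines of shell and is not shown in the episode. The loop he describes (fetch a JSON snapshot, post it to Axiom, sleep five minutes) might look something like the Python below. Several things here are assumptions rather than details from the show: the Axiom ingest URL shape and bearer-token header reflect the editor's understanding of Axiom's HTTP ingest API, the dataset name and token are placeholders, and how the snapshot is fetched is left as a caller-supplied function so that no command line arguments for the BMW tool are guessed at.

```python
import json
import time
import urllib.request

# Assumed endpoint shape for Axiom's ingest API; check the current docs.
AXIOM_URL = "https://api.axiom.co/v1/datasets/{dataset}/ingest"
POLL_SECONDS = 300  # the car appears to report only every five minutes


def build_ingest_request(dataset, token, events):
    """Build (but do not send) a POST carrying a JSON array of events."""
    return urllib.request.Request(
        AXIOM_URL.format(dataset=dataset),
        data=json.dumps(events).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def poll(fetch_status, send, cycles, sleep=time.sleep, interval=POLL_SECONDS):
    """Fetch one snapshot per cycle, hand it to `send`, and sleep between
    cycles. `fetch_status` returns a dict; `send` takes a list of dicts."""
    for i in range(cycles):
        send([fetch_status()])
        if i < cycles - 1:
            sleep(interval)


def send_to_axiom(dataset, token):
    """Return a `send` callable that posts events to the given dataset."""
    def send(events):
        with urllib.request.urlopen(build_ingest_request(dataset, token, events)):
            pass
    return send
```

In use, `fetch_status` would wrap whatever produces the JSON dump (for example a subprocess call to the command line tool, with arguments taken from its own documentation), and `poll(fetch_status, send_to_axiom("my-dataset", token), cycles=12)` would cover an hour. Injecting `sleep` and `send` keeps the loop testable without a car or a network connection.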

Show Notes

In this episode:

Mark has been exploring the secrets of the TPM
Martin is running Radeon and NVIDIA GPUs in one PC.
Alan is getting ongoing data from an EV.

You can send your feedback via show@linuxmatters.sh or the Contact Form. If you’d like to hang out with other listeners and share your feedback with the community you can join:

The Linux Matters Chatters on Telegram.
The #linux-matters channel on the Late Night Linux Discord server.

If you enjoy the show, please consider supporting us using Patreon or PayPal. For $5 a month on Patreon, you can enjoy an ad-free feed of Linux Matters, or for $10, get access to all the Late Night Linux family of podcasts ad-free.

Hosts