Mtg 5/26: Tue-26-Jan-2021

Outline for Today

Media

Transcript

Next Meeting

    Decode unicode - the world's writing systems

UTF-8

Zoom Audio Transcript

  • Good afternoon, everyone. So I'm just trying to set up the attendance for today. So, can you go ahead and do that now? Let me know if it's giving you some problems. Some of you wrote at the last meeting that you have had some experiences of processing family. So I'd be interested to hear more about see tech tool for, and what the approach was there. Anyway. Yeah, so this is part of what we did last day, and also part of what we'll do today. I didn't put everything in the right place for today, but today I want to see this video on how binary code works. Oh wait, I'm not sure I'm sharing my screen, am I? Let me try that. I don't think it's very long, so in case people didn't see it, let's try and see if I can do desktop audio. Closed captions on. I didn't turn the audio off today. No, I muted myself anyway. So. I think it's kind of a neat introduction generally. So when it talked about characters, it said the label is UTF-8. These are also ASCII characters. Let me explain that in a minute.
  • We.
  • Put in here. Oh yeah. A couple of things that I found interesting. Khan Academy. And it just relates to the optical shutter telegraph. But it's also. So if we say this is zero, or no, and this is on, or yes, or one. You can see, if everything's off, you get the very lowest value. And then we can turn on. So if we add one to this. Two. The first one is zero, and then one, and then two, and then three, and then four, and five. Little pause there. 15. Eight.
  • 9, 10, 11.
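The counting above can be reproduced with a short sketch. It assumes four on/off switches (the transcript does not say how many the video uses) and lists every pattern from all-off to all-on; the name `switch_patterns` is made up for illustration.

```python
# Count upward with on/off switches, as in the walkthrough above.
# NUM_SWITCHES = 4 is an assumption made for illustration only.
NUM_SWITCHES = 4


def switch_patterns(num_switches):
    """Return (value, pattern) pairs from all-off to all-on."""
    return [
        (value, format(value, f"0{num_switches}b"))
        for value in range(2 ** num_switches)
    ]


patterns = switch_patterns(NUM_SWITCHES)
for value, pattern in patterns:
    # '0' means off/no, '1' means on/yes.
    print(f"{pattern} = {value}")
```

Each extra switch doubles the number of patterns, so `switch_patterns(5)` returns twice as many entries as `switch_patterns(4)`.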
  • So. The tree, the tree that branches here. See. And look, it's enough, that's enough to describe the cards that we have and the relationship between our choices. So if we added one more card, then we're adding another whole tree above it. And if we add a second card, then we have this tree that's duplicated above it. I just thought it was neat to click and play around with. A nice interactive example of how things work. And they talked about yes-or-no choices. I mean, they're choices, but the choice determines what number, what values we associate with the code that we create. So the choices aren't made at random; they're made based on the data we're trying to communicate. And they're made based on the format of the data that we have, or the encoding format for the data. And that's different depending on what we're trying to encode. So characters can be represented as Unicode. That takes four bytes. UTF-8 is a format for encoding. UTF, pardon me, UTF stands for Unicode Transformation Format, I think. I'll look that up. But the U is Unicode, so it's a format for transmitting Unicode characters. So we saw, when we have ASCII, we need only eight bits. Actually, we only need seven, that's right. If we go back to that video. Let me go back to the video. Audio turned off. The video. That's funny, isn't it. See, UTF-8. So it's not wrong to say that it's UTF-8, but it's also not wrong to say this is ASCII, because it's seven bits, right? The eighth bit, the leftmost bit, is always zero, so there's no information there. If we were to write these in hexadecimal, that would be. That makes sense. You have to know that C is 43 in hex. So, because of these zeros in front, that means the encoding is UTF-8, and also ASCII. So what's the benefit of having UTF-8? To be able to represent characters, which would normally take four bytes, in one byte. So that's a bit of savings, isn't it? 
So you can use one quarter the amount of bits to transmit a message in UTF-8. Potentially your savings can be up to a factor of four. You can save up to that much. Which is pretty substantial if you're sending a long message. So what happens is, UTF-8 means we can.
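A minimal sketch of the two points just made: ASCII text is already valid UTF-8 with the leftmost bit of every byte set to zero, and it needs a quarter of the bytes that a fixed four-byte encoding would use. The sample message is arbitrary, and Python's `utf-32-be` codec stands in here for "four bytes per character"; the lecture does not name a specific four-byte encoding.

```python
# Compare the size of the same text in UTF-8 and in a fixed
# four-bytes-per-character encoding (UTF-32, big-endian, no byte-order mark).
message = "Good afternoon, everyone."  # arbitrary ASCII-only sample text

utf8_bytes = message.encode("utf-8")
ascii_bytes = message.encode("ascii")
utf32_bytes = message.encode("utf-32-be")

# For ASCII-only text, the UTF-8 bytes are identical to the ASCII bytes.
same_as_ascii = utf8_bytes == ascii_bytes

# The leftmost (eighth) bit of every byte is 0, so it carries no information.
high_bits_clear = all(byte < 0x80 for byte in utf8_bytes)

# Bit pattern and hex value of the letter 'C'.
print(format(ord("C"), "08b"), format(ord("C"), "02X"))

# Byte counts: one byte per character versus four bytes per character.
print(len(utf8_bytes), len(utf32_bytes))
```

The factor of four only holds for text that stays within ASCII; characters outside that range take two to four bytes each in UTF-8.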
  • You.
  • It's a variable-length encoding. And we can use eight-bit, or one-byte, units. Another popular encoding, which predated UTF-8, is UTF-16. That way they use a 16-bit, or two-byte, unit. So there can be one or two of those units to think about. If there's one UTF-16, pardon me, one 16-bit word, that refers to what's called the Basic Multilingual Plane. So that gives you a basic selection of international characters. Then we can go through different planes. It turns out, the way that UTF-16 is done, it doesn't allow for encoding of all the four-byte options from Unicode. UTF-8 could allow for that, but in order to maintain compatibility, UTF-8 and UTF-16 represent the same number of characters. We could squeeze in more characters using UTF-8. That makes sense, so far? An example. If the character code points are up to 7F. 7F will be 0111 1111, right? So 7F is just all ones except for the leading zero, so the seven bits are all set to one. If we can do that, then we encode it with a zero at the front, so that's the same as ASCII. So UTF-8 has the advantage of keeping ASCII, and it's ASCII-compatible. For people speaking and writing in North American English, that's a good compromise. But we also have other characters available. So you can see how we're adding. If we need to use two bytes, then we have 110 as the start of the first byte and 10 as the start of the second byte. So that leaves us 11 bits to fit the data that we need to transmit. So we have not more than 11 bits. Well, we have between eight and 11 bits, because if seven or fewer are needed, then we can use this first one. So 8 to 11 bits fits in this form. And then between 12 and 16 bits, that's here. So notice that we're adding 
another one here, so this is three bytes. And this indicates the trailing bytes: the second one here, the second and third ones. And then we have four bytes. And we have four ones to start with, followed by zero. So we can only store three bits in the first byte. And then we get six in each of the other three, so we have six plus six plus six plus three is 21. That's the capacity for what we're storing and sending in four bytes in UTF-8. So, as you can see. You can see that, all right? Let's see if I can. Okay, so I'm going to attempt to do this on my tablet here. See how seamlessly I can do this today. So that's to encode. So. I think it's 00E9. Convert it to binary. So let's do that. Anyone want to share that with me? Who wants to turn the microphone on and give me the binary for nine? Don't everyone race to it at once. Let me check. Eight plus one. And E.
  • 1110.
  • Right, because E is one short of F: 14, 15. You don't really need to look at those. How many non-zero bits do we have to encode? The leading zeros we don't have to worry about. So first of all, we know that it's less than or equal to 7FF, because we're only using two positions, right? So E9 is less than FF, never mind the seven part. Okay, so we can encode this in two bytes. So then let's take a look at the form. I was making you work for it a little bit on the worksheet here. So here's the required form in byte two. We know we can do it in two bytes. The first byte looks like 110 followed by five data bits; the second byte looks like 10 followed by six data bits. So we start from the least significant bit and start copying. We need how many bits for this byte? Six bits. Take all four of these. Two more. 1011. That makes sense? And then we take five bits. Here we only have two, so we'll pad them with zeros in front. Okay, so let's convert to hexadecimal, for convenience and practice. Nine here. And it has an eight plus two, 10, here, right? What do we have next? Anyone? Tell me that one. Yeah. Three. And then we have an eight plus four, which is C. So, C3. Okay, so we're still saving to transmit this, the actual UTF-8 character. That makes sense? If you type C3 A9 into Google, I bet you'll see a character that looks like e with an accent. Try that as well. It comes up with FileFormat.Info for me anyway: Unicode character Latin small letter E with acute. Okay, so let's go ahead and try another one, more complicated. Try that one. So how many hex digits do we have? 1, 2, 3, 4, 5. So you know it's got to be bigger than this one, because there's a limit of four digits here. But it fits into this one, because we have five, and then we have an extra bit. So there's room for us with 21 bits. That makes sense? So, first of all let's convert to binary. 
So there's no eight; a four and a two makes six. And then D is 13. So we need an eight and a four and one. And then. That makes sense so far? Are you working on a different screen than you're sharing? I didn't think so. It's not kept up. No, which is never a good sign. But I have a cord ready to go here, and I'll see if I can save this. "Trust this computer" message, but I'm not. Let me try and go back to the wireless connection. OK, so now you can see the data, but not the other one. I'm running out of room to write. So we have four bytes, so 1, 2, 3, 4 ones, then 0; that leaves us three bits. Okay, so now we're going to write in the six over here.
  • 010101.
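The hand encoding worked through above can be checked with a short sketch. It follows the same steps: pick the byte count from the size of the code point, copy six bits at a time starting from the least significant end, and put whatever is left into the first byte. The function name `utf8_encode` is made up for illustration, and the sketch skips some validation a real encoder performs (for example, it does not reject the surrogate range D800 to DFFF). The five-digit code point 2A6D6 is taken from the chat transcript.

```python
def utf8_encode(code_point):
    """Encode one code point into UTF-8 bytes by hand (illustrative sketch)."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:        # up to 7 bits:  0xxxxxxx (same as ASCII)
        return bytes([code_point])
    if code_point <= 0x7FF:       # up to 11 bits: 110xxxxx 10xxxxxx
        lead, trailing = 0b11000000, 1
    elif code_point <= 0xFFFF:    # up to 16 bits: 1110xxxx + two 10xxxxxx
        lead, trailing = 0b11100000, 2
    elif code_point <= 0x10FFFF:  # up to 21 bits: 11110xxx + three 10xxxxxx
        lead, trailing = 0b11110000, 3
    else:
        raise ValueError("code point is outside the Unicode range")

    tail = []
    for _ in range(trailing):
        # Copy the six least significant bits into a 10xxxxxx trailing byte.
        tail.append(0b10000000 | (code_point & 0b111111))
        code_point >>= 6
    # The remaining bits, padded with leading zeros, go into the first byte.
    first = lead | code_point
    return bytes([first] + tail[::-1])


for cp in (0xE9, 0x2A6D6):
    print(f"U+{cp:04X} ->", utf8_encode(cp).hex().upper())
```

Comparing the result with Python's built-in encoder, `chr(cp).encode("utf-8")`, is a quick way to check an answer worked out on paper.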

Zoom Chat Transcript

  • good afternoon
  • good
  • how r u
  • Worked for me
  • it worked for me too
  • Me too
  • its working fine for me too
  • I am in it right now and hope to compare the two in my posts for each lecture in this class
  • ya I did
  • very informative
  • 1011
  • 1001
  • 1001
  • 1001
  • 1110
  • yes
  • 3
  • 3
  • Are you working on a different screen than you are sharing?
  • we still see the Hex: C3A9 screen
  • should the last 4 bits not be 0110 when you convert from 2A6D6?
  • smiley face
  • so semi colon and right parenthesis?
  • *colon
  • 6 is 0110
  • Final answer should be F09F 988A
  • sounds good.
  • thanks
  • Thanks!
  • thank you!
  • thanks
  • thanks.
  • Thank you
  • Thank you!
  • See you Thursday!
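The final answer posted in the chat, F09F 988A, can be checked by decoding those four bytes. This is a minimal sketch using Python's built-in UTF-8 decoder rather than anything shown in the meeting; it recovers the code point and looks up its official Unicode name.

```python
import unicodedata

# Decode the four bytes given as the final answer in the chat.
final_answer = bytes.fromhex("F09F 988A")  # fromhex skips the space
character = final_answer.decode("utf-8")

code_point = ord(character)
name = unicodedata.name(character)

print(f"U+{code_point:04X}", name, character)
```

The bytes decode to a single character at U+1F60A, which matches the "smiley face" mentioned in the chat.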

Responses

Wiki

Link to the UR Courses wiki page for this meeting