New implementation #8

Closed
wants to merge 9 commits into from
11 changes: 10 additions & 1 deletion CHANGELOG.md
@@ -5,10 +5,19 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project
adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [0.2.3] - 2020-07-07
## [Unreleased]

### ADDED
- `SmartString` now supports null pointer optimizations. `Option<SmartString>` is now the same size as `SmartString`.
- A feature flag `lazy_null_pointer_optimizations`, which enables null pointer optimizations for `SmartString<LazyCompact>`. On by default.
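
The size claim above can be checked with `std::mem::size_of`. The following is a minimal sketch using hypothetical stand-in structs (`WithNiche` and `WithoutNiche` are illustrations, not the crate's actual types): a field such as `NonNull` or `NonZeroUsize` has a forbidden bit pattern that current rustc can reuse to encode `None`, so wrapping the struct in `Option` costs no extra space.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;
use std::ptr::NonNull;

// Hypothetical string header with non-null/non-zero fields (assumption:
// this only resembles the layout in the PR, it is not the real type).
#[allow(dead_code)]
struct WithNiche {
    ptr: NonNull<u8>,
    size: usize,
    capacity: NonZeroUsize,
}

// Same shape, but every bit pattern of every field is valid,
// so `Option` needs a separate tag.
#[allow(dead_code)]
struct WithoutNiche {
    ptr: *mut u8,
    size: usize,
    capacity: usize,
}

/// Extra bytes that wrapping `T` in `Option` costs.
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}
```

Calling `option_overhead::<WithNiche>()` should report zero extra bytes, while `option_overhead::<WithoutNiche>()` reports a non-zero cost for the tag and its padding.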

### FIXED
- `SmartString` now uses the size or capacity field to store the discriminant bit, instead of relying on pointer alignment. (#4)
- `SmartString` no longer relies on the internal layout of `String`. (#4)

## [0.2.3] - 2020-07-07

### ADDED
- `SmartString` now implements `Display`. (#6)
- `SmartString` now implements `FromIterator<char>`.
- Support for [`serde`](https://serde.rs/) behind the `serde` feature flag. (#2)
2 changes: 2 additions & 0 deletions Cargo.toml
@@ -20,7 +20,9 @@ name = "smartstring"
harness = false

[features]
default = ["lazy_null_pointer_optimizations"]
test = ["arbitrary", "arbitrary/derive"]
lazy_null_pointer_optimizations = []

[dependencies]
static_assertions = "1.1.0"
173 changes: 142 additions & 31 deletions src/boxed.rs
@@ -2,55 +2,166 @@
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

use std::cmp::Ordering;
use crate::{inline::InlineString, SmartString, SmartStringMode};
use std::ops::{Deref, DerefMut};

pub trait BoxedString {
fn string(&self) -> &String;
fn string_mut(&mut self) -> &mut String;
fn into_string(self) -> String;
pub trait BoxedString: Deref<Target = str> + DerefMut + Into<String> {
// Safety: when null pointer optimizations are enabled for `LazyCompact`,
// the caller must ensure the string's capacity is non-zero; a capacity of 0 is unsound.
unsafe fn from_string_unchecked(string: String) -> Self;
fn capacity(&self) -> usize;
}

// Mirrors `String`, but with the fields in a fixed (`#[repr(C)]`) order.
#[cfg(target_endian = "big")]
#[repr(C)]
#[derive(Debug)]
pub struct PseudoString {
capacity: usize,
ptr: std::ptr::NonNull<u8>,
size: usize,
}

fn cmp_with_str(&self, other: &str) -> Ordering;
fn cmp_with_self(&self, other: &Self) -> Ordering;
fn eq_with_str(&self, other: &str) -> bool;
fn eq_with_self(&self, other: &Self) -> bool;
#[cfg(target_endian = "little")]
#[cfg(not(feature = "lazy_null_pointer_optimizations"))]
#[repr(C)]
#[derive(Debug)]
// This appears to be the most common field order for `std::string::String`.
// With lazy null pointer optimizations, however, this order does not work.
pub struct PseudoString {
ptr: std::ptr::NonNull<u8>,
capacity: usize,
size: usize,
}

fn len(&self) -> usize {
self.string().len()
#[cfg(target_endian = "little")]
#[cfg(feature = "lazy_null_pointer_optimizations")]
#[repr(C)]
#[derive(Debug)]
pub struct PseudoString {
ptr: std::ptr::NonNull<u8>,
size: usize,
capacity: std::num::NonZeroUsize,
}

impl Deref for PseudoString {
type Target = str;
fn deref(&self) -> &str {
unsafe {
let slice = std::slice::from_raw_parts(self.ptr.as_ptr().cast(), self.size);
std::str::from_utf8_unchecked(slice)
}
}
}

impl BoxedString for String {
#[inline(always)]
fn string(&self) -> &String {
self
impl DerefMut for PseudoString {
fn deref_mut(&mut self) -> &mut str {
unsafe {
let slice = std::slice::from_raw_parts_mut(self.ptr.as_ptr().cast(), self.size);
std::str::from_utf8_unchecked_mut(slice)
}
}
}

impl From<PseudoString> for String {
#[inline(always)]
fn string_mut(&mut self) -> &mut String {
self
fn from(string: PseudoString) -> Self {
unsafe {
String::from_raw_parts(
string.ptr.as_ptr(),
string.size,
usize::from(string.capacity),
)
}
}
}

#[cfg(feature = "lazy_null_pointer_optimizations")]
unsafe fn to_capacity(size: usize) -> std::num::NonZeroUsize {
std::num::NonZeroUsize::new_unchecked(size)
}

#[cfg(not(feature = "lazy_null_pointer_optimizations"))]
fn to_capacity(size: usize) -> usize {
size
}

impl BoxedString for PseudoString {
unsafe fn from_string_unchecked(mut string: String) -> Self {
// `String::into_raw_parts` is nightly-only at the time of writing.
// Once it is stable, the code below should be replaced with:
// let (ptr, size, capacity) = string.into_raw_parts();
let capacity = string.capacity();
let bytes = string.as_mut_str();
let ptr = bytes.as_mut_ptr();
let size = bytes.len();
std::mem::forget(string);

Self {
ptr: std::ptr::NonNull::new_unchecked(ptr),
size,
capacity: to_capacity(capacity),
}
}

fn into_string(self) -> String {
self
fn capacity(&self) -> usize {
usize::from(self.capacity)
}
}
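
The split-and-rebuild round trip used by `from_string_unchecked` and `From<PseudoString> for String` can be shown in isolation. This is a minimal sketch under stated assumptions: `into_parts` and `from_parts` are hypothetical helper names, and `ManuallyDrop` is swapped in for the `mem::forget` call in the diff; both prevent the buffer from being freed while the raw parts are still in use.

```rust
use std::mem::ManuallyDrop;

/// Splits a `String` into (pointer, length, capacity) without freeing its buffer.
/// Hypothetical helper standing in for the nightly `String::into_raw_parts`.
fn into_parts(string: String) -> (*mut u8, usize, usize) {
    // `ManuallyDrop` suppresses the destructor, so the allocation stays alive.
    let mut string = ManuallyDrop::new(string);
    (string.as_mut_ptr(), string.len(), string.capacity())
}

/// Rebuilds the `String`.
///
/// Safety: the parts must come from a single `into_parts` call and must be
/// used at most once, otherwise the buffer would be freed twice.
unsafe fn from_parts(ptr: *mut u8, len: usize, capacity: usize) -> String {
    unsafe { String::from_raw_parts(ptr, len, capacity) }
}
```

Passing a string through `into_parts` and then `from_parts` should return the same contents with the same capacity, because ownership of the allocation is handed over rather than copied.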

#[inline(always)]
fn cmp_with_str(&self, other: &str) -> Ordering {
self.as_str().cmp(other)
#[derive(Debug)]
pub(crate) struct StringReference<'a, Mode: SmartStringMode> {
referrant: &'a mut SmartString<Mode>,
string: String,
}

impl<'a, Mode: SmartStringMode> StringReference<'a, Mode> {
// Safety: the discriminant of `smart` must be `Boxed`.
pub(crate) unsafe fn from_smart_unchecked(smart: &'a mut SmartString<Mode>) -> Self {
debug_assert_eq!(
smart.discriminant(),
crate::marker_byte::Discriminant::Boxed
);
let boxed: Mode::BoxedString = std::mem::transmute_copy(smart);
let string = boxed.into();
Self {
referrant: smart,
string,
}
}
}

#[inline(always)]
fn cmp_with_self(&self, other: &Self) -> Ordering {
self.cmp(other)
impl<'a, Mode: SmartStringMode> Drop for StringReference<'a, Mode> {
fn drop(&mut self) {
let string = std::mem::replace(&mut self.string, String::new());
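// Cargo features must be tested as `feature = "..."`; a bare identifier in `cfg!` never matches one.
if (Mode::DEALLOC && string.len() <= Mode::MAX_INLINE)
|| (!Mode::DEALLOC
&& cfg!(feature = "lazy_null_pointer_optimizations")
&& string.capacity() == 0)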
{
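// Write through the borrowed SmartString rather than reinterpreting `self`,
// whose field order is not guaranteed without `#[repr(C)]`.
let target = (&mut *self.referrant as *mut SmartString<Mode>).cast();
unsafe {
std::ptr::write(target, InlineString::<Mode>::from(string.as_bytes()));
}
} else {
let target = (&mut *self.referrant as *mut SmartString<Mode>).cast();
unsafe {
std::ptr::write(
target,
Mode::BoxedString::from_string_unchecked(string),
);
}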
}
}
}

#[inline(always)]
fn eq_with_str(&self, other: &str) -> bool {
self == other
impl<'a, Mode: SmartStringMode> Deref for StringReference<'a, Mode> {
type Target = String;
fn deref(&self) -> &String {
&self.string
}
}

#[inline(always)]
fn eq_with_self(&self, other: &Self) -> bool {
self == other
impl<'a, Mode: SmartStringMode> DerefMut for StringReference<'a, Mode> {
fn deref_mut(&mut self) -> &mut String {
&mut self.string
}
}
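
`StringReference` follows a write-back guard pattern: take the string out of compact storage, hand it to the caller as a plain `String`, then re-encode it when the guard is dropped. The sketch below illustrates that pattern with safe code only. `Slot`, `Guard`, and `MAX_INLINE` are hypothetical stand-ins, not the crate's types, and the inline threshold of 8 bytes is an arbitrary assumption.

```rust
use std::ops::{Deref, DerefMut};

const MAX_INLINE: usize = 8; // assumed threshold, for illustration only

/// Hypothetical compact storage standing in for `SmartString`.
#[derive(Debug, PartialEq)]
enum Slot {
    Inline { buf: [u8; MAX_INLINE], len: usize },
    Boxed(String),
}

impl Slot {
    /// Picks the inline form when the string fits, the boxed form otherwise.
    fn from_string(string: String) -> Self {
        if string.len() <= MAX_INLINE {
            let mut buf = [0u8; MAX_INLINE];
            buf[..string.len()].copy_from_slice(string.as_bytes());
            Slot::Inline { buf, len: string.len() }
        } else {
            Slot::Boxed(string)
        }
    }
}

/// Lends the contents out as a `String` and writes them back on drop.
struct Guard<'a> {
    slot: &'a mut Slot,
    string: String,
}

impl<'a> Guard<'a> {
    fn new(slot: &'a mut Slot) -> Self {
        let string = match &mut *slot {
            Slot::Inline { buf, len } => String::from_utf8_lossy(&buf[..*len]).into_owned(),
            Slot::Boxed(s) => std::mem::take(s),
        };
        Guard { slot, string }
    }
}

impl Deref for Guard<'_> {
    type Target = String;
    fn deref(&self) -> &String {
        &self.string
    }
}

impl DerefMut for Guard<'_> {
    fn deref_mut(&mut self) -> &mut String {
        &mut self.string
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // Re-encode whatever the caller left behind, switching forms if needed.
        let string = std::mem::take(&mut self.string);
        *self.slot = Slot::from_string(string);
    }
}
```

Growing the string through a `Guard` past `MAX_INLINE` should leave the slot in the `Boxed` form after the guard goes out of scope, and truncating it back should return it to `Inline`. The design choice is that every `String` method becomes available through `DerefMut` without the storage type having to reimplement it.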
12 changes: 10 additions & 2 deletions src/casts.rs
@@ -2,19 +2,27 @@
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

use crate::{inline::InlineString, SmartStringMode};
use crate::{boxed::StringReference, inline::InlineString, SmartStringMode};

pub(crate) enum StringCast<'a, Mode: SmartStringMode> {
Boxed(&'a Mode::BoxedString),
Inline(&'a InlineString<Mode>),
}

pub(crate) enum StringCastMut<'a, Mode: SmartStringMode> {
Boxed(&'a mut Mode::BoxedString),
Boxed(StringReference<'a, Mode>),
Inline(&'a mut InlineString<Mode>),
}

pub(crate) enum StringCastInto<Mode: SmartStringMode> {
Boxed(Mode::BoxedString),
Inline(InlineString<Mode>),
}

// Same as `transmute`, except it doesn't check that the two types have the same size.
// This should be replaced once it is possible to constrain the sizes of generic associated types.
pub(crate) unsafe fn please_transmute<A, B>(from: A) -> B {
let ret = std::mem::transmute_copy(&from);
std::mem::forget(from);
ret
}
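
A helper like `please_transmute` is needed because `std::mem::transmute` refuses to compile when either type's size depends on a generic parameter. The sketch below shows that situation with hypothetical names (`transmute_unchecked`, `Wrapper`, `unwrap_generic` are illustrations only). The added `debug_assert_eq!` is an assumption of this sketch and is not in the PR; it restores the size check at run time in debug builds.

```rust
use std::mem;

/// Moves the bytes of `from` into a value of type `B`.
///
/// Safety: `A` and `B` must have the same size and the bytes of `from`
/// must form a valid `B`.
unsafe fn transmute_unchecked<A, B>(from: A) -> B {
    debug_assert_eq!(mem::size_of::<A>(), mem::size_of::<B>());
    let ret = unsafe { mem::transmute_copy(&from) };
    // `from` has been bitwise-copied into `ret`; skip its destructor
    // so the value is not dropped twice.
    mem::forget(from);
    ret
}

/// Hypothetical generic wrapper with the same layout as `T`.
#[repr(transparent)]
struct Wrapper<T>(T);

/// `mem::transmute` rejects this conversion because the size of `T` is not
/// known in a generic context; the unchecked helper accepts it.
fn unwrap_generic<T>(wrapped: Wrapper<T>) -> T {
    // Sound because `Wrapper<T>` is `repr(transparent)` over `T`.
    unsafe { transmute_unchecked(wrapped) }
}
```

Here the conversion could also be written as `wrapped.0`; the point is only to show a case where the compile-time size check of `transmute` cannot be satisfied even though the sizes are equal by construction.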